FOREWORD


In  a  letter  under  date  of  12  December  1967,  Dr.  Charles  A.  Reynolds, 
Technical  Director  of  Edgewood  Arsenal,  issued  an  invitation  to  hold 
the  Fourteenth  Conference  on  the  Design  of  Experiments  in  Army  Research, 
Development and Testing at Edgewood Arsenal, Maryland. In his letter,
Dr. Reynolds set the dates for this meeting as 23-25 October 1968, and
he  appointed  Messrs.  Joseph  Mandelson  and  Raymond  Schnell  to  serve  as 
Co-Chairmen  on  Local  Arrangements.  These  conferences  are  sponsored  by 
the Army Mathematics Steering Committee and they come under the supervision
of the AMSC Subcommittee on Probability and Statistics. Dr. Walter
Foster,  the  Chairman  of  this  Subcommittee,  was  happy  to  accept  this 
invitation  and  started  laying  the  groundwork  for  this  conference.  He 
and other members of the AMSC would like to thank Messrs. Mandelson and
Schnell,  as  well  as  many  other  employees  of  Edgewood  Arsenal,  who  helped 
to  make  the  Fourteenth  Conference  such  an  enjoyable  and  successful 
meeting. 

These conferences are open to scientific personnel of all Government
agencies,  and  the  participation  on  the  program  by  staff  members  of 
various  agencies  has  been  gratifying.  In  this,  and  in  past  meetings, 
scientists  from  the  National  Bureau  of  Standards  have  contributed  a 
great  deal  to  the  tone  of  these  symposia.  It  seems  appropriate  that 
we  point  out  some  of  the  Bureau  participants  in  this  Edgewood  Arsenal 
Conference.  Dr.  Joseph  Cameron  served  as  a  member  of  the  Program 
Committee;  and  he,  along  with  Dr.  Joan  R.  Rosenblatt,  served  as 
panelists  in  several  of  the  clinical  sessions.  Messrs.  H.  H.  Ku  and 
Roy  H.  Wampler  each  presented  technical  papers.  Further,  there  was 
presented  a  paper  which  was  authored  jointly  by  David  Hogben  and  John 
Mandel.  We  are  pleased  to  be  able  to  publish  most  of  these  papers  in 
this  technical  manual. 

Those  attending  the  conference  had  the  pleasure  of  hearing  the 
following  invited  speakers  talk  on  the  topics  noted  below: 

Broadening  the  Horizons  of  Experimental  Design 
Lieutenant  General  William  B.  Bunker 
U. S. Army Materiel Command

Structure  and  Classification  of  Patterns 
Professor  Rolf  E.  Bargmann 
University  of  Georgia 

Bulk Sampling
Professor Acheson J. Duncan
Johns Hopkins University

Time Series
Professor Emanuel Parzen
Stanford University

The  keynote  speaker,  General  Bunker,  died  before  these  Proceedings 
could  be  issued.  His  passing  is  a  heavy  loss  to  the  scientific 
community,  and  to  me,  a  special  loss,  as  he  was  a  warm  personal 
friend. 

An  outstanding  feature  of  the  program  of  the  Fourteenth 
Conference  was  a  panel  on  Bulk  Sampling.  This  is  an  area  of 
statistics  of  special  interest  to  the  scientific  personnel  of 
the  host  installation.  Dr.  Walter  Foster  served  as  chairman  and 
organizer of this phase of the agenda. He selected Professor A. J.
Duncan to serve as a Discussant and Advisor to the following Panel
Members:  Henry  Ellner;  Boyd  Harshbarger;  G.  R.  Lowrimore;  Joseph 
Mandelson;  and,  V.  H.  Rechmeyer.  Another  outstanding  feature  of 
these  conferences  is  the  awarding  of  the  Wilks  Memorial  Medal.  This 
year,  it  was  my  pleasure  to  announce  that  Professor  Jerzy  Neyman,  of 
the  University  of  California  at  Berkeley,  was  selected  to  receive  the 
Fourth  Samuel  S.  Wilks  Memorial  Medal. 

Members  of  the  Army  Mathematics  Steering  Committee  think  that 
the  papers  presented  at  the  conference  have  made  valuable  contributions 
to  the  fields  of  the  design  of  experiments,  statistics,  and  reliability, 
and  have  requested  that  these  articles  be  published  in  these  Proceedings. 
They  wish  to  thank  the  many  speakers,  chairmen,  and  panelists  for  their 
help  in  conducting  this  symposium. 

The  conference  had  an  attendance  of  163  scientists,  and  50 
organizations  were  represented.  Speakers  and  panelists  came  from: 

Cornell Aeronautical Lab; Duke University; Federal Electric Corporation/ITT;
Hercules, Inc.; Johns Hopkins University; National Bureau
of Standards; Stanford University; Thiokol Chemical Corporation;
University of Chicago; University of Georgia; Virginia Polytechnic
Institute; and nineteen Army facilities.

Colonel  Paul  R.  Cerar,  Commanding  Officer  of  Edgewood  Arsenal, 
gave  the  Welcoming  Remarks  for  the  host  installation.  In  his  talk, 
he gave many interesting and historical facts about Edgewood Arsenal.
His address is published here for the edification of those who were
not  able  to  hear  him  speak. 

Formulation  of  the  outstanding  features  of  this  conference  and 
the  selection  of  the  invited  speakers  were  made  by  the  members  of  the 
Program  Committee  (Joseph  Cameron,  Francis  Dressel,  Walter  D.  Foster, 
Fred  Frishman,  Boyd  Harshbarger,  William  Kruskal,  H.  L.  Lucas,  Jr., 
Clifford Maloney, Joseph Mandelson, Henry Mann, Raymond B. Schnell,
and Herbert Solomon). The Chairman wishes these individuals to know
that  he  appreciated  their  assistance  and  valued  their  comments  on  the 
various  phases  of  the  program. 

Frank  E.  Grubbs 
Conference  Chairman 




WELCOME* 


Colonel Paul R. Cerar
Commanding  Officer,  Edgewood  Arsenal 


General Bunker, distinguished guests and speakers, ladies and
gentlemen.

Edgewood  Arsenal  is  proud  and  gratified  to  have  been  chosen  to  act 
as  your  host  for  this,  the  Fourteenth  Annual  Conference  on  the  Design  of 
Experiments  in  Army  Research,  Development,  Testing  and  Evaluation.  I 
consider  it  a  privilege  to  welcome  you  on  behalf  of  the  arsenal  and  its 
personnel.  It  is  particularly  fitting  that  our  arsenal  should  be  given 
this  opportunity  as  part  of  its  scientific  program  for  this  year  of  our 
existence,  a  half  century  of  work  and  achievement  as  a  significant  element 
in  the  defense  structure  of  our  country. 

In  October  of  1917  the  War  Department  acquired  this  reservation,  later 
to become the infant Gas Warfare Service's first home, and in May 1918 named
the  installation  Edgewood  Arsenal.  During  the  lean  years  between  world 
wars  Edgewood  Arsenal  struggled  to  prepare  the  military  arm,  offensively  and 
defensively,  in  the  area  of  chemical  warfare.  Despite  the  meager  resources 
allotted, especially during the depression years, somehow the installation
survived  to  provide  the  basic  cadre  for  the  enormous  expansion  to  over  7000 
military  and  8000  civilian  personnel  in  the  peak  years  of  World  War  II. 

Through  their  devoted  efforts,  our  military  forces  were  provided  with  a 
capability  in  research,  development,  procurement  and  supply  of  chemical 
offensive  and  defensive  materiel. 

Existing  industrial  and  manufacturing  facilities  were  rehabilitated  and 
new  ones  built.  Necessary  support  facilities  such  as  utilities,  an  airstrip, 
and  an  expanded  rail  network  were  added.  The  chemical  warfare  school  was 
expanded  and  a  modern  laboratory  complex  was  built  to  house  consolidated 
research and development activities. In May 1942 the installation was
redesignated the Chemical Warfare Center. In August 1946 the name was changed
to Army Chemical Center, but in 1963 we reverted to the original title:
Edgewood Arsenal.

In  a  re-organization  approved  7  July  1966,  Edgewood  Arsenal  was 
designated  the  U.S.  Army's  Chemical  Commodity  Center  with  responsibility  for 
all  chemical  weapons  and  defense  materiel  research  and  development,  subordinate  to 
U.S.  Army  Munitions  Command.  Its  previous  administrative  control  over 
Fort  Detrick  was  relinquished  and  Fort  Detrick  became  a  separate  commodity 
center  with  responsibility  for  biological  weapons  and  defense  research  and 
development.  However,  because  certain  of  our  responsibilities  overlap  those 
of  Fort  Detrick  the  old  cooperation  between  the  two  installations  is  still  in 
existence  both  by  necessity  and  choice. 


*Colonel  Cerar  gave  the  Welcoming  Remarks  at  the  start  of  the  Conference  and 
also  served  as  Chairman  of  General  Session  1. 
Two sub-posts fall under the command jurisdiction of the Edgewood
Arsenal Commander: Pine Bluff Arsenal, Arkansas and Rocky Mountain
Arsenal, Colorado. These two arsenals are engaged in various aspects
of procurement, manufacture and testing of chemical materiel.

Over  the  years,  then,  Edgewood  Arsenal  has  grown  to  represent  about 
$115  million  in  fixed  investments,  to  include  $9.6  million  in  land  and 
improvements;  $78.1  million  in  buildings  and  facilities;  and  $27  million 
in  machinery  and  equipment.  These  figures  do  not  include  our  sub-posts. 

The  installation  employs  over  3,800  civilians  and  over  1,600  military 
personnel  with  a  combined  gross  payroll  of  some  $40  million. 

Among  our  civilian  employees  more  than  900  hold  bachelor  degrees; 
over  190  have  master  degrees;  and  75  have  attained  their  doctorates. 

In  connection  with  the  subject  which  is  basic  to  the  purpose  of  this 
conference  -  statistics  as  it  is  employed  in  research,  development, 
testing  and  evaluation  -  Edgewood  Arsenal  can  point  to  a  long,  and  a 
still  growing  interest  and  participation  in  this  highly  specialised 
field.  Starting  about  1942,  statistics  of  this  type  began  to  be  used 
in  preparing  specification  requirements  and  later  in  the  development 
of  certain  theoretical  concepts  upon  which  our  surveillance  and  other 
quality  assurance  activities  are  based.  Much  of  this  work  found  its 
way  into  the  literature  and  our  personnel  were  actively  engaged  in  the 
development of important sampling standards. Interest in, and utilization
of, statistics soon spread from our quality assurance elements to
our  research,  development  and  testing  activities.  At  a  later  date,  an 
Operations  Research  Group  was  formed  in  whose  work,  as  you  know,  statistical 
principles  play  a  major  role.  This  group  was  recently  incorporated  into 
the U. S. Army Munitions Command but it remains physically located on
this  post. 

The  Chemical  Corps  Engineering  Command  sponsored  several  conferences 
on  Statistical  Engineering  in  the  1950's  which  some  of  you  may  have 
attended.  It  has  been  our  policy  to  encourage  our  personnel  to  take 
an active part in all professional activities - delivering and publishing
technical papers and acting as chairmen and moderators of
technical  sessions. 

Our  background  dates  back  some  26  years,  when,  as  you  may  recall, 
the  work  of  Professors  Fisher  and  Pearson  in  England  on  the  Design  of 
Experiments  and  even  the  work  of  Shewhart,  Dodge,  and  Romig  in  this 
country  in  Statistical  Quality  Control  were  practically  unknown.  You 
can  see  why  Edgewood  Arsenal  feels  so  proud  to  act  as  your  hosts  for 
the  next  three  days. 

At  this  point,  I  am  pleased  to  acknowledge  our  indebtedness  to  the 
Army  Research  Office  and  to  its  arrangements  committee  for  inviting  us 
to  host  this  conference  and  to  extend  my  thanks  through  Dr.  Francis 
Dressel,  the  Secretary,  to  this  committee  for  the  excellent  work  they  have 
done  in  securing  such  outstanding  speakers  and  in  arranging  so  interesting 
a technical program. We are especially honored and pleased to have as
our  keynote  speaker,  a  distinguished  soldier  who  has  taken  a  very  keen 
and  active  interest  in  the  subject  to  be  discussed. 

Lieutenant  General  William  B.  Bunker  is  a  graduate  of  the  United 
States  Military  Academy,  Class  of  '34.  He  attended  the  Massachusetts 
Institute  of  Technology  receiving  his  degree  of  Master  of  Science  in 
Engineering.  During  World  War  II,  General  Bunker  served  as  Deputy  in 
Charge  of  the  Transportation  Corps'  Supply  Program  and,  in  1945,  as  7th 
Army  Transportation  Officer,  during  the  occupation  of  Germany. 

When  the  Berlin  Airlift  began  in  1948  the  General  was  put  in  charge 
of  Terminal  Operations  governing  gathering  of  shipments,  loading  in  the 
United  States  zone,  unloading  and  distributing  cargo  in  Berlin.  He 
organized  a  similar  system  between  Korea  and  Japan  when  hostilities 
erupted  in  1950. 

In  1950  the  Chief  of  Transportation  named  General  Bunker  to  be  Chief, 
Air  Transport  Division,  investigating  the  application  of  the  helicopter 
to  Army  transportation.  The  result  of  this  investigation  was  an  immediate 
large  scale  expansion  of  this  activity.  General  Bunker  was  appointed 
Commandant  of  the  U.  S.  Army  Transportation  School  in  1954  and  the 
following  year  was  assigned  as  Commander,  U.  S.  Army  Transportation 
Materiel  Command,  responsible  for  logistic  support  of  Army  aviation. 

He  was  promoted  to  Major  General  1  June  1961. 

In  February  1962  he  became  a  member  of  the  planning  group  which 
developed  the  organization  for  the  Army  Materiel  Command  and  in  June 
was  assigned  as  its  Comptroller  and  Director  of  Programs.  On  1  April 
1962  he  became  Deputy  Commanding  General,  U.  S.  Army  Materiel  Command 
and  was  thereupon  promoted  to  Lieutenant  General  on  9  May  1966. 

General  Bunker  has  been  the  recipient  of  many  decorations  for  his 
outstanding  work  in  a  long  and  honorable  career,  not  only  from  his  own 
grateful  country  but  also  from  the  United  Kingdom  and  Nicaragua. 

He is a member of professional societies, has published various
articles in technical journals, and has developed a keen interest in
the  use  of  statistics  in  Army  Research,  Development,  Testing  and 
Evaluation. 

It is with great pleasure that I introduce our keynote speaker,
Lieutenant  General  William  B.  Bunker. 

The  title  of  his  address  is:  "Broadening  the  Horizons  of  Experimental 
Design." 

Thank you, General Bunker, for your very interesting and
informative address.
One of the most important objectives of these conferences has been
to afford the conferees an opportunity to explore with authorities in
the field those aspects of the subject matter which had most recently
received  major  attention  and  development.  When  such  areas  have  been 
determined,  it  has  become  the  practice  to  invite  experts  in  these 
various  areas  to  speak  on  the  topics  selected. 

Our  next  speaker  is  Professor  Rolf  Erwin  Bargmann  of  the  University 
of  Georgia  and  the  Thomas  J.  Watson  Research  Center  of  IBM.  He 
has  had  a  varied  career,  having  been  a  Rockefeller  Foundation  Fellow 
prior  to  taking  his  Doctorate  in  Mathematical  Statistics  at  the  University 
of North Carolina. He was associated with our State Department in
Germany  and  served  as  an  interpreter  during  the  Nuremberg  Trials.  He 
was  Assistant  Professor  of  Statistics  and  Head  of  the  Department  at 
Frankfurt,  later  Associate  Professor  of  Statistics  at  Virginia  Polytechnic 
Institute.  He  achieved  full  professorship  in  1959.  He  was  a  consultant 
to  White  Sands  Proving  Ground  in  the  summers  of  1957  and  1959.  He  is 
a  Fellow  of  the  American  Association  for  the  Advancement  of  Science  and 
a  member  of  several  statistical  societies. 

It  gives  me  great  pleasure  to  present  Professor  Bargmann,  who  will 
speak  on,  "The  Structure  and  Classification  of  Patterns." 
BROADENING THE HORIZONS OF EXPERIMENTAL DESIGN


LT General William B. Bunker, Deceased
U.S. Army Materiel Command
Washington,  D.C. 


From  its  early  beginnings,  statistics  has  been  an  important  vehicle 
with  which  reasonable  men  have  attempted  to  seek  an  understanding  of  the 
problems  which  confront  them.  Some  of  the  earliest  developments  and 
applications  of  statistical  concepts  occurred  in  response  to  problems  at 
the  gaming  tables.  In  fact,  I  have  been  told  that  more  than  one  early 
statistician  earned  his  keep  by  calculating  odds  for  a  wealthy  gambler. 

The  basic  orientation  of  statistics  toward  the  solution  of  practical 
problems  can  be  found  as  the  motivation  for  many  major  developments  in 
statistics. For example, Thomas Bayes in his often quoted and controversial
essay stressed his desire to provide a more efficient procedure
for the estimation of probabilities. More recently, the contributions
of Professor R.A. Fisher in the area of small sample statistics were
motivated by a desire to improve the analytic tools available in biomedical
research.

The  essential  point  is  that  many  of  the  important  developments  in 
statistics  were  motivated  by  a  desire  to  solve  real  world  problems.  I 
am  concerned  that  in  some  quarters  this  orientation  to  problem-solving 
has  been  replaced  with  a  tendency  toward  self  contemplation  and  a  primary 
interest in statistical purity. There is a need to re-examine the direction
of current efforts and to confront our major problems head-on. Only
through  broadening  the  horizons  of  experimental  design  can  we  hope  to  deal 
effectively  with  our  most  pressing  problems. 

Today,  as  a  first  step  toward  broadening  the  horizon,  I  would  like 
to  spend  the  remainder  of  my  time  discussing  several  areas  that  are  amenable 
to  the  application  of  the  concepts  of  experimental  statistics. 

SYSTEM  TESTING  AND  DEVELOPMENT.  One  important  area  in  which  much 
work is needed involves the statistical issues in equipment testing. At
the outset, I want to stress that our test programs are not and in fact
cannot be scientific experiments. One reason for this is that the traditional
requirements for the design of experiments are infeasible within
the  context  of  a  test  and  development  program.  For  example,  a  basic 
principle  of  design  of  experiments  involves  the  control  or  minimization  of 
the  variation  in  the  experimental  situation.  This  is  an  almost  impossible 
requirement  to  satisfy  for  two  reasons.  First,  due  to  modification  in  the 
system  during  development,  the  basic  heterogeneity  of  experimental  units  is 
high. This inherent variability represents a violation of a basic statistical
assumption. Second, the dimensions of the problem frequently preclude
control or even measurement of extraneous sources of variation. This problem
was illustrated in the test program for our new AAFSS.

The  status  of  a  scientific  experiment  also  is  denied  to  our  development 
and test programs because of the fact that we just can't afford the large
number of data points that are required in a classical experimental design.
In practice, testing is done on a small number of prototype systems. If
an attempt were made to gather the number of observations required to achieve
the desired level of statistical significance, no development would ever
take place.

The  statistical  aspects  of  testing  programs  are  further  compounded  by 
our  difficulties  in  specification  of  the  model.  In  many  of  our  test  programs 
it  is  difficult  to  begin  to  select  the  relevant  variables  and  logically 
impossible  to  identify  the  important  interactions  and  nonlinearities. 

Our recent experience with the development of 152mm ammunition for the
Sheridan provides a case in point. The variable of interest in this case
is binomial: either the round fires or it does not fire. We know that
reliability of this ammunition is a function of a number of variables
including quality control, the efficiency of the scavenger system, the ammunition
case, and the storage environment, but we also realize that there are n other
important dimensions of the problem which remain to be identified. For
example,  through  observation  we  have  established  an  interaction  between  the 
degree  of  moisture  in  the  powder  and  the  quantity  of  residue.  Experience 
has  demonstrated  that  higher  moisture  content  resulted  in  more  residue.  In 
response  to  this  finding  we  have  lowered  the  moisture  content,  but  this 
change  raises  a  question  concerning  other  yet  unknown  interactions  that 
are  at  work  in  determining  the  reliability  of  the  ammunition. 

Changing  the  moisture  content  also  illustrates  another  problem  that 
pervades  the  testing  programs.  When  the  nature  of  an  item  is  altered  as 
a  matter  of  course  in  testing  and  development,  how  does  one  aggregate  the 
test data that were generated prior to the change with the data that have
been  gathered  after  the  change?  In  a  strict  sense,  the  modification  has 
changed  the  basic  structure  of  the  situation  that  is  being  modeled,  and 
has  made  the  two  sets  of  data  incommensurable.  In  reality,  we  are  measuring 
a  series  of  separate  probability  curves  and  are  reporting  the  envelope  of 
these curves. This is analogous to developing a baseball batting average
by  combining  performance  in  the  preliminary  grapefruit  league  with  that  in 
standard  league  play.  In  both  cases,  the  cumulative  measure  of  performance 
combines  early  and  tentative  results  with  those  that  have  been  obtained 
after  the  system  has  been  brought  up  to  working  order.  The  net  effect  of 
this  procedure  is  to  substantially  understate  the  reliability  of  the  system. 
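
The batting-average analogy can be made concrete with a short Python sketch that pools failure counts across successive configurations and compares the pooled rate with that of the current configuration alone. The phase counts are hypothetical and chosen only for illustration; they are not data from any test program described here.

```python
def failure_rate(failures, trials):
    """Observed fraction of trials that failed."""
    return failures / trials

# Hypothetical test phases as (failures, rounds fired); illustrative only.
phases = [
    (12, 2_000),   # early prototype configuration
    (5, 5_000),    # after a first modification
    (1, 20_000),   # current configuration
]

# Pooling treats every round as if it came from the same item.
pooled_failures = sum(f for f, _ in phases)
pooled_trials = sum(n for _, n in phases)
pooled = failure_rate(pooled_failures, pooled_trials)

# The rate that describes the item as it now stands.
current = failure_rate(*phases[-1])

print(f"pooled failure rate:  {pooled:.6f}")
print(f"current failure rate: {current:.6f}")
```

With these made-up counts the pooled failure rate is larger than the rate for the current configuration, which is the sense in which cumulative reporting understates reliability.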

Given  this  situation,  how  can  we  give  our  customer  a  valid  statement 
of  quality  assurance?  Upon  examining  the  results  of  the  testing  program, 
the  statistician  would  say  that  we  have  a  ratio  of  approximately  1  to  52,000, 
but what we really need to satisfy the customer is a ratio of 1 to 1,000,000.
At  this  point  I  can  say,  qualitatively,  that  the  real  reliability  of  the 
system  is  understated;  however,  it  is  impossible  to  specify  the  absolute 
magnitude  of  the  error.  Naturally,  the  customer  is  not  satisfied  with 
the  statement  about  reliability  of  the  ammunition,  and  something  must  be 
done  to  improve  the  situation.  The  statisticians'  answer  to  this  dilemma 
is more testing to develop the required observations. This is an extremely
costly  procedure  and  it  would  have  been  better  to  have  done  more  work  on 
estimating  the  initial  function.  Ad  hoc  testing  at  this  juncture  is  not  a 
feasible  solution  to  the  problem. 
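
The cost of the "more testing" answer can be gauged with a standard zero-failure binomial demonstration: if n independent rounds all fire, a failure probability of p or worse is ruled out at confidence C when (1 - p)^n <= 1 - C. The sketch below assumes independent, identical trials and a 90 percent confidence level; both are assumptions made for illustration and do not come from the address.

```python
import math

def rounds_required(max_failure_prob, confidence=0.90):
    """Failure-free rounds n needed so that (1 - p)**n <= 1 - confidence."""
    n = math.log(1.0 - confidence) / math.log1p(-max_failure_prob)
    return math.ceil(n)

demonstrated = rounds_required(1 / 52_000)   # ratio quoted from the test results
wanted = rounds_required(1 / 1_000_000)      # ratio the customer needs
print(demonstrated, wanted)
```

Under these assumptions the second figure runs to more than two million consecutive successful firings, which indicates why further ad hoc testing is described as extremely costly.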
An alternative approach can be found in the area of statistical
decision  theory.  Resolution  of  this  dilemma  may  be  achieved  through  the 
combination  of  the  subjective  judgment  of  the  experts  and  objective 
experimental  results. 
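
One common way to combine expert judgment with test results is a Beta prior on the failure probability, updated by binomial data. The sketch below is only an illustration of that idea, not a method given in the address; the prior parameters and the test counts are hypothetical.

```python
def beta_posterior(prior_a, prior_b, failures, trials):
    """Update a Beta(prior_a, prior_b) belief about the failure probability."""
    return prior_a + failures, prior_b + (trials - failures)

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

# Hypothetical expert judgment: about 1 failure per 1,000,000 rounds,
# held with the weight of 100,000 rounds of evidence.
prior_a, prior_b = 0.1, 99_999.9

# Hypothetical test record: 1 failure in 52,000 rounds.
post_a, post_b = beta_posterior(prior_a, prior_b, failures=1, trials=52_000)

prior_mean = beta_mean(prior_a, prior_b)
observed_rate = 1 / 52_000
posterior_mean = beta_mean(post_a, post_b)
print(prior_mean, observed_rate, posterior_mean)
```

The posterior mean falls between the expert's figure and the raw test ratio, weighted by how much evidence each is taken to represent.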

A second area in our testing program that requires attention involves
the  development  of  large,  expensive  systems.  The  Main  Battle  Tank  provides 
a  good  illustration  of  the  problem.  We  really  have  only  a  vision  of  the 
MBT.  In  this  situation,  the  problem  is  that  there  is  no  real  testing  of 
the  whole  system.  Instead,  tests  are  conducted  on  different  vehicles  with 
various configurations. This means that most of the parameters of interest
vary  from  test  to  test  and  that  very  little  remains  constant  among  the 
tests.  What  we  are  attempting  to  model  then  is  really  a  function  of 
functions. Causal factors can no longer be expressed simply as numeric
values  but  themselves  must  be  represented  as  functions,  the  values  of 
which  are  in  turn  dependent  upon  the  value  of  the  total  function. 

One  analytic  technique  that  has  been  utilized  to  attempt  to  model 
a  function  of  functions  is  dynamic  programming.  In  the  development  of 
the  basic  algorithm  Bellman  used  a  recursive  scheme  to  reflect  the  method 
of  sequential  calculation  that  is  the  essence  of  the  approach.  For  example, 
consider  an  aerial  weapon  system  consisting  of  a  navigational  subsystem, 
a  target  acquisition  subsystem,  and  a  weapon  subsystem.  It  is  desired  to 
determine  the  optimal  characteristics  of  all  three  subsystems,  but  all 
these decisions are interdependent. The thing we do know is that whatever
navigational and target acquisition subsystems are chosen, the characteristics
of the weapon system, e.g., the rate of fire, must be optimal with
respect  to  the  effectiveness  of  the  whole  system.  Using  the  principle  of 
optimality  proposed  by  Bellman,  we  can  say  that  the  optimum  rate  of  fire 
is  a  function  of  the  effectiveness  of  the  aerial  weapons  system.  Since 
we  do  not  know  the  optimal  characteristics  for  the  other  two  subsystems, 
the  optimal  rate  of  fire  and  total  system  effectiveness  must  be  found  for 
all  feasible  outputs  of  the  subsystem.  This  technique  may  provide  a  clue 
regarding  the  way  to  handle  complex  equations  without  knowing  their  specific 
form. 
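
A minimal sketch of the recursive scheme follows, under assumptions that are not in the address: each of the three subsystems receives a whole number of budget units, the table EFFECT (hypothetical values) gives the chance that a subsystem performs its task at each funding level, and total effectiveness is the product of the three.

```python
from functools import lru_cache

# Hypothetical tables: EFFECT[s][k] is the effectiveness of subsystem s
# when it receives k budget units (k = 0, 1, 2, 3).
EFFECT = (
    (0.50, 0.70, 0.80, 0.85),  # navigation
    (0.40, 0.60, 0.75, 0.80),  # target acquisition
    (0.30, 0.60, 0.80, 0.90),  # weapon
)

@lru_cache(maxsize=None)
def best(stage, budget):
    """Return (max effectiveness, allocation) for subsystems stage..end."""
    if stage == len(EFFECT):
        return 1.0, ()
    best_value, best_plan = -1.0, ()
    max_units = min(budget, len(EFFECT[stage]) - 1)
    for k in range(max_units + 1):
        # Principle of optimality: whatever is spent here, the remaining
        # subsystems must be optimal for the budget that is left.
        tail_value, tail_plan = best(stage + 1, budget - k)
        value = EFFECT[stage][k] * tail_value
        if value > best_value:
            best_value, best_plan = value, (k,) + tail_plan
    return best_value, best_plan

value, plan = best(0, 4)
print(value, plan)
```

The function is evaluated for every feasible remaining budget at each stage, which corresponds to finding the optimal weapon characteristics for all feasible outputs of the preceding subsystems before the earlier choices are fixed.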


The  essential  point  is  that  we  must  move  away  from  concepts  that 
require  the  testing  of  a  static  system.  Pressures  imposed  by  necessary 
modifications  of  systems  in  the  development  process  do  not  allow  all  other 
things  to  remain  equal  and  this  dynamic  aspect  of  the  environment  cannot 
be  ignored. 

On  balance,  it  appears  that  increased  emphasis  on  rigor  in  the  design 
of  experiments  has  diverted  our  attention  from  the  ultimate  objectives. 
Efforts  must  be  undertaken  to  develop  techniques  which  provide  feasible 
solutions  to  problems  of  quality  assurance  and  the  manipulation  of  more 
complex  dynamic  models.  We  need  to  soften  the  science  of  experimental 
design  to  make  it  a  more  useful  tool  in  test  and  development  programs. 

The alternative to this change is to continue to strive for more technically
precise answers which are even less meaningful in the decision
making process. Unless a conscious effort is made to avoid this plight,
experimental statistics may create a paradox similar to that caused by
managerial accounting. As a tool of management, the discipline of
accounting  has  experienced  an  increase  in  the  precision  with  which 
financial information is analyzed and reported, but it still does not
provide  much  assistance  in  the  decision  making  process.  Decision  makers 
can  safely  rely  on  accounting  to  identify  the  loss  after  the  investment 
has  failed,  but  it  is  of  no  help  in  forecasting  the  likelihood  of  this 
occurrence.  It  is  an  after  the  fact  discipline,  and  our  requirements 
are  for  knowledge  before  the  fact. 

While reflecting on these challenges that lie ahead, it may be useful
to reconsider the role of statistical analysis in the decision making
process. The decision maker is concerned with choosing between two or
more alternatives, the value of which remains to be established by events
in the future. Statistical analysis is valuable only to the extent to
which it raises the level of understanding of the problem and in so doing
provides an improved basis for fixing beliefs about the future. In
contrast, analyses that provide interesting expositions, but no additional
understanding,  are  of  little  value.  It,  therefore,  is  essential  for  the 
analyst  to  be  attuned  to  informational  requirements  of  the  decision  maker 
if  real  progress  is  to  be  made. 

MANAGEMENT  INFORMATION  SYSTEMS.  A  second  area  which  could  benefit 
from  the  attention  of  statisticians  is  the  design  of  management  information 
systems.  Even  a  cursory  examination  of  the  recent  attempts  to  design  and 
implement  management  information  systems  reveals  the  opportunity  for 
substantial  improvement  through  the  infusion  of  the  concepts  of  experimental 
statistics.  Many  of  these  efforts  reflect  a  lack  of  understanding  of  the 
available techniques for summarizing and analyzing data. The result of
this naivete has been inefficiency in system design and confusion regarding
the purpose and value of the output of the system. For example, the
operational readiness of our Hawk units throughout the world must be monitored
daily  by  phone.  Since  this  information  is  vital  to  decision  makers  at  the 
highest  levels,  one  would  have  hoped  that  a  less  cumbersome  communication 
system  could  have  been  planned. 

To provide you with more background on the problem area, it may be
useful  to  examine  briefly  the  origin  of  our  current  dilemma.  The  root 
of  the  problem  can  be  found  in  our  recently  acquired  capacity  to  process 
and transmit information rapidly. In the last thirty years technological
progress has resulted in the development of three generations of computers,
each of which represented a dramatic improvement over the existing state of
the art. Equipped with the exciting abilities to process in a real time
mode  and  to  directly  access  data  banks,  the  designers  of  these  systems 
have  moved  in  the  direction  of  including  everything  about  everything  in 
the  system. 

One  example  of  the  problem  is  provided  by  the  periodic  Army  readiness 
report that is prepared for the Chief of Staff. Included in this report,
in great detail, is information not only on major items such as tanks and
jeeps but on many minor items as well. Once attention was drawn to
equipment  readiness  at  this  level  of  specificity  it  became  apparent  that 
the  number  and  status  of  most  of  the  items  were  subject  to  continual  change. 
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This  meant  that  the  job  of  preparing  a  large  scale  report-  *.?22  further 
''<rr.pcur.dcd  by  the  fact  cnat  the  information  had  to  be  updated  and  publi‘:.;ed 
frequently,  if  it  was  to  be  of  value  in  its  current  form.  A  question  can 
be  asked  as  whether  or  not  this  is  a  worthwhile  or  even  feasible  effort. 

This  same  point  should  be  raised  in  every  management  information  system. 

In  nearly  all  phases  of  our  business  today  one  can  observe  information 
being  translated  into  electronic  impulses  for  transmission  up  to  higher 
levels  of  authority.  It  is  important  to  note  that  once  data  is  separated 
from its traditional hard copy vehicle, e.g., the DA Form, it can be sorted,
summarized, or transmitted at almost unbelievable speeds. It is this
speed and the low per unit cost of processing information which have caused
many  of  the  current  problems  with  management  information  systems. 

These  rapid  changes  in  communications  technology  have  caused  some 
rather  traumatic  experiences  in  most  large  organizations.  To  begin  with, 
many  management  theorists  and  most  managers  of  today  are  still  thinking 
in  terms  of  the  traditional  forms  of  organization  structure.  These 
concepts generally involve pyramidal configurations of the different
layers  of  authority.  The  problem  is  that  these  organizations  reflect  a 
certain  state  of  information  processing  technology  and  this  level  of 
technology  is  rapidly  becoming  obsolete.  There  is  no  doubt  that  a  certain 
disparity  has  always  existed  between  the  institutionalized  organization 
structure  and  information  technology;  however,  recent  innovations  have 
aggravated  and  accentuated  the  problem.  It  is  useful  to  examine  the 
factors  that  are  important  to  this  problem  in  order  to  better  evaluate 
alternative  solutions. 

One  important  factor  is  the  heterogeneity  in  the  speed  with  which 
different  types  of  information  are  processed  through  the  organization. 

While it is now possible to rapidly analyze and summarize information on
personnel  strength  through  the  organization,  it  is  still  necessary  to 
individually  monitor  the  progress  of  many  R&D  programs.  So  within  the 
same  large  organization,  new  information  processing  techniques  have 
dramatically  affected  the  form  and  function  of  some  activities  while 
others  remain  essentially  unchanged.  This  phenomenon  has  made  the 
traditional  concepts  of  a  centralized  and  decentralized  organization 
obsolete  in  that  both  tendencies  are  apparent  within  many  phases  of  our 
business. 

The  increasing  magnitude  of  the  upward  flow  of  information  also 
serves  to  exacerbate  the  disparity  between  information  processing  tech¬ 
nology  and  organizational  structure.  Too  frequently,  our  concept  of 
the  informational  requirements  that  must  be  transmitted  up  to  top 
management  reflects  a  lack  of  appreciation  for  the  objectives  of  the 
system.  Most  communication  that  an  individual  has  with  the  higher 
levels  of  the  organization  is  through  his  immediate  superior.  Communica¬ 
tion  at  this  level  is  intimate  and  detailed  and  this  is  as  it  should  be 
between  superior  and  subordinate.  This  is  not,  however,  the  appropriate 
level  of  communication  between  a  first  line  supervisor  and  top  management. 
The  top  level  manager  has  neither  the  need  to  know  nor  the  capability 
to  assimilate  the  large  volumes  of  specific  information;  and,  therefore, 
it  makes  little  sense  to  send  information  at  this  level  of  detail  up  through 
the information system.

In  addition  to  being  illogical,  this  tendency  has  serious  implications 
for  the  organization  and  the  decision  maker.  If  the  trend  continues, 
middle  management  will  of  necessity  be  relegated,  in  large  measure,  to 
the job of expediting the flow of information up the line of authority.

More  important,  however,  is  the  effect  of  this  tendency  on  the  performance 
of  the  decision  maker.  From  his  point  of  view,  this  tremendous  flow  of 
information  provides  an  all  encompassing  yet  fragmentary  view  of  reality. 
While the decision maker has easy access to information regarding every
significant dimension of the problem, and some trivial ones as well, he
may still find himself in a quandary over the nature of the situation. The
reality  of  any  situation  is  extremely  complex  when  viewed  in  its  entirety. 
Most  of  us  have  learned  through  experience  in  situations  to  suppress  those 
aspects  of  reality  which  are  superfluous  to  the  problem  at  hand;  however, 
the  ability  to  do  this  effectively  depends  on  an  intimate  understanding  of 
the  particular  problem  and  environment.  This  point  illustrates  a  major 
impetus  for  specialization  of  interest  and  talent  but  raises  a  serious 
question  concerning  the  relationship  between  the  top  level  decision  maker 
and  the  information  system.  It  is  obvious  that  no  top  manager,  regardless 
of  his  ability,  can  begin  to  accumulate  experience  comparable  to  the  new 
sum  of  that  possessed  by  the  specialists  in  his  organization.  It  should 
be  equally  obvious  that  the  detail  and  format  of  information  required  by 
the  manager  is  markedly  different  from  that  which  is  required  in  the  lower 
echelons. This is, however, only half the problem.

The  sorting  and  evaluating  of  information  by  the  decision  maker  is 
further  complicated  by  the  fact  that  the  information  has  been  abstracted 
from the environment to which it is indigenous. No longer is it possible
to view the situation in its totality or to make inferences from the
juxtaposition of the various elements. The information is now presented
in a homogeneous package and there is little effort made to illustrate the
relative importance of the various bits of information. This format
encourages  the  tendency  to  limit  the  analysis  to  what  are  apparently 
obvious  relationships  in  the  data,  and  all  too  often,  these  obvious 
relationships  depict  only  a  superficial  view  of  the  problem.  When  con¬ 
fronted  with  such  a  situation,  the  decision  maker  is  tempted  to  feel  that 
his evaluation is profound when it may in fact be obvious and trivial or,
even worse, incorrect.

The  question  then  arises  as  to  what  alternatives  are  available  to 
aid  us  in  resolving  this  dilemma.  One  answer  to  the  problem  may  be  found 
in  the  imaginative  and  effective  application  of  the  techniques  of  statis¬ 
tical  analysis.  Concepts  and  procedures  that  have  been  used  successfully 
for  years  by  statisticians  offer  the  means  by  which  meaningful  order  can 
be restored in our information systems.

Returning  to  the  example  of  the  Army  readiness  reports,  in  this 
information  system  the  emphasis  has  been  placed  on  reporting  the  status 
of  practically  every  item  in  the  inventory.  A  moment's  reflection  reveals 
that  this  approach  is  a  violation  of  the  principle  of  parsimony.  Why  is 
it  necessary  to  report  data  on  the  status  of  every  item,  when  we  are  really 
only interested in those items in a particular status? It is encouraging
to  note  that  all  information  systems  have  not  proceeded  down  the  same 
path.  The  New  York  City  Department  of  Public  Health,  for  example,  does 
not  attempt  to  measure  the  health  status  of  the  city  by  directly  estimating 
the  proportion  of  the  total  population  who  are  well.  Instead,  their 
attention  is  focused  only  on  those  who  are  sick.  Their  approach  is  to 
monitor  the  population  of  the  hospitals  throughout  the  city.  Through 
observation  of  this  one  accessible  indicator,  they  are  able  to  maintain 
an  adequate  estimate  of  the  general  level  of  health  of  the  community. 

The principle is to replace the real variable of interest with a
surrogate which is more easily measured and analyzed. This has been a
relatively  common  practice  among  statisticians  and  it  should  have  applica¬ 
tion  in  the  design  of  our  information  systems.  In  the  case  of  the  readiness 
report,  a  substantial  increase  in  the  value  of  the  effort  would  be  realized 
by  reporting  exceptions  rather  than  the  status  of  the  whole  system.  This 
scheme  would  substantially  reduce  the  upward  flow  of  information  and  focus 
attention on the real variable of interest. In another phase of the
operation, perhaps the status of a particular maintenance operation could be
gauged  more  efficiently  and  accurately  through  the  examination  of  the 
re-enlistment  rates  rather  than  the  number  of  items  serviced  per  month. 

The  kind  of  changes  suggested  would  not  only  reduce  the  upward  flow  of 
information  but  also  place  the  information  in  a  form  and  format  that  is 
more  useful  in  the  decision  making  process. 
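
The exception-reporting idea can be sketched in a few lines. The record
layout, unit names, and the `ready` flag below are hypothetical and chosen
only for illustration; the point is simply that only items in the status of
interest are passed upward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemStatus:
    """One line of a (hypothetical) readiness report."""
    unit: str
    item: str
    ready: bool


def exceptions_only(report):
    """Keep only the records that are NOT ready, i.e. the exceptions."""
    return [record for record in report if not record.ready]


# Hypothetical full report covering every item in the inventory.
full_report = [
    ItemStatus("1st Bn", "tank", True),
    ItemStatus("1st Bn", "jeep", False),
    ItemStatus("2nd Bn", "tank", True),
    ItemStatus("2nd Bn", "radio", False),
]

exceptions = exceptions_only(full_report)

# Fraction of the upward flow that is avoided by reporting exceptions only.
reduction = 1 - len(exceptions) / len(full_report)
```

In practice the saving depends entirely on how rare the exceptional status
is; the smaller the share of items needing attention, the larger the
reduction in upward flow.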

The  concepts  of  sampling  offer  yet  another  statistical  tool  that 
appears  to  have  application  in  the  design  of  information  systems.  Even 
if  modern  technology  can  provide  us  with  the  machine  capability  to  process 
information at very high speeds, this capability has a significant, positive
cost.  It  is  therefore  necessary  to  examine  alternative  ways  to  economize 
in  the  operation.  Sampling  theory  provides  the  basic  notions  for  efficiently 
and  economically  gathering  data  about  a  particular  population  of  interest. 

For  example,  the  mean  cost  of  procuring  an  item  could  be  estimated  accurate¬ 
ly  and  at  a  mere  fraction  of  the  cost  of  total  enumeration  through  the  use 
of a self-weighting, stratified sample. It should also be remembered that
in  many  cases,  sample  estimates  might  be  even  better  than  would  usually 
be  expected  because  our  concern  is  primarily  with  finite  populations. 
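
As a rough illustration of the sampling argument, the sketch below estimates
a mean procurement cost from a stratified sample with proportional
allocation (which makes the sample self-weighting) and applies the finite
population correction. The strata, cost ranges, and sample size are
hypothetical simulation inputs, not figures from any actual system.

```python
import random
from statistics import mean, variance


def proportional_allocation(stratum_sizes, n):
    """Split a total sample of about n units across strata in proportion to size."""
    total = sum(stratum_sizes.values())
    return {
        name: min(size, max(2, round(n * size / total)))
        for name, size in stratum_sizes.items()
    }


def stratified_mean(strata, sample_sizes, rng):
    """Estimate the population mean and its standard error.

    strata maps a stratum name to the list of all unit costs in it; the full
    list is known here only because this is a simulation.
    """
    population = sum(len(values) for values in strata.values())
    estimate = 0.0
    var = 0.0
    for name, values in strata.items():
        n_h = sample_sizes[name]
        sample = rng.sample(values, n_h)      # sampling without replacement
        weight = len(values) / population     # stratum share of the population
        fpc = 1.0 - n_h / len(values)         # finite population correction
        estimate += weight * mean(sample)
        if n_h > 1:
            var += weight ** 2 * fpc * variance(sample) / n_h
    return estimate, var ** 0.5


# Hypothetical procurement costs in dollars, in two strata.
rng = random.Random(1968)
strata = {
    "minor items": [rng.uniform(5, 50) for _ in range(900)],
    "major items": [rng.uniform(500, 5000) for _ in range(100)],
}
sizes = proportional_allocation({k: len(v) for k, v in strata.items()}, n=100)
estimate, std_error = stratified_mean(strata, sizes, rng)
true_mean = mean(x for values in strata.values() for x in values)
```

The factor `fpc` is what makes estimates from finite populations better than
the infinite-population formula suggests: as the sample approaches a complete
enumeration of a stratum, its contribution to the variance goes to zero.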

A  more  general  perspective  for  design  of  an  information  system  may  be 
gained  from  the  philosophy  of  analysis  that  pervades  among  statisticians. 
While  many  of  the  designers  of  information  systems  have  been  content  to 
concentrate  on  the  preparation  and  reporting  of  data,  the  interest  of  most 
statisticians continues through analysis and interpretation. Efforts must
be made to bring the analysis phase into the design of a system. Up to
this point system designers have emphasized performance measures such as
speed or cost per calculation as measures of effectiveness, but we have
seen that this approach ignores the important question about system
effectiveness, i.e., what is the value of information? Timeliness of information is
important;  however,  in  our  effort  to  obtain  more  current  data  we  have 
ignored  certain  other  important  aspects  of  the  problem.  Is  it  really 
worth  anything  to  the  organization  to  spend  additional  money  to  send 
information  more  quickly  if  much  of  the  information  in  the  system  is 
already redundant or unusable? Does it make sense to publish figures
in a daily report if it will require several weeks' worth of observation
to  verify  whether  a  change  in  the  data  is  real  or  simply  an  aberration? 

The  answer  to  both  questions  is  obviously  no!  Both  queries  suggest  that, 
in  the  future,  major  payoffs  will  accrue  to  advances  in  the  analysis  of 
data  that  can  be  incorporated  within  the  system.  Further  analysis  will 
take  additional  time;  however,  it  should  also  substantially  increase  the 
informational  value  of  reports.  When  examining  this  tradeoff  it  is 
essential  to  remember  that  most  changes  that  take  place  within  a  large 
organization  are  gradual  and  occasionally  painfully  slow.  Given  this 
situation, it is reasonable to expect that the opportunity cost of the
time lost during further analysis may be substantially less than the
value of the increased understanding which would be generated.

In  summary,  there  is  a  genuine  need  to  apply  the  philosophy  of 
experimental  design  to  the  design  of  management  information  and  control 
systems.  Statistical  techniques  can  help  to  determine  which  variables 
should  be  measured  and  which  should  be  ignored,  as  well  as  facilitating 
the analysis and forecasting of trends. Up to now, there has been little
feedback  between  those  interested  in  experimental  design  and  those  involved 
in  information  system  design.  Much  of  what  we  know  in  the  latter  area  has 
been  the  result  of  a  trial  and  error  process,  and  as  I  am  sure  you  are 
well  aware,  this  can  be  a  very  expensive  way  to  learn.  If  some  of  the 
statistical  notions  of  sampling  and  analysis  can  be  communicated  to 
system  designers,  then  substantial  payoffs  will  be  realized.  A  response 
in  this  direction  now  will  encourage  efficiency  and  progress.  If  no 
response  is  forthcoming,  however,  and  decision  making  continues  to  escalate, 
a requirement for total information reporting will demand a huge
organization just for purposes of processing. In many ways, the dilemma of the
decision maker is analogous to that of an individual who attempts to
examine the behavior of a particle suspended in liquid. The more the
individual studies the particle the more confused he becomes by the random
effect of Brownian motion. The perception of both the hypothetical individual
and the decision maker could be improved through the use of certain basic
statistical notions.

CONCLUSION. As we have seen there are a number of opportunities to
broaden the horizons of experimental design through reduced emphasis on
rigor and increased attention to current problems, be they in testing or
systems  design.  The  next  move  is  up  to  you. 
THE STRUCTURE AND CLASSIFICATION OF PATTERNS*

Rolf  E.  Bargmann 
University  of  Georgia 
Athens,  Georgia 


TERMINOLOGY.**

Logical Pattern: A set of p diagnostic events is observed.
Occurrence is marked by 1, non-occurrence by 0. Each such single
observation results in a row of 0's and 1's. Observations are repeated, and
several such rows constitute a pattern. If rows are dependent (e.g.,
observation at consecutive times), a cyclical autocorrelation dependence
is assumed.

Major Event: One or very few underlying artificial events, each
of which may assume two or more states, which influence the probability
of occurrence of each diagnostic event.

Calibration Pattern: A logical pattern, consisting of several rows,
containing observation of occurrences and non-occurrences of all diagnostic
events if the major event (or, rather, some physical event closely related
to the artificial major event) is in a known state (e.g., repeated
observation of symptoms of a patient who suffers from a known disease).

Model  Assumption  (leading  to  a  variant  of  the  Latent  Class  Model): 

The  state  of  the  major  event  determines  the  probability  of  occurrence 
or  non-occurrence  of  each  diagnostic  event.  Except  for  this  influence, 
the  diagnostic  events  are  assumed  to  be  independent  (principle  of 
conditional independence).

Sample  Pattern:  A  logical  pattern  consisting  of  one  or  more  rows, 
describing  a  situation  where  the  state  of  the  major  event  is  unknown. 

Its  distance  (Euclidean  distance  or,  better,  -2  log  likelihood)  from 
each  of  the  calibration  patterns  determines  the  proximity  of  the  current 
state  of  the  major  event  to  each  of  the  known  states  represented  by  the 
calibration  patterns. 

Note  that  extensive  calculations  are  required  on  calibration  patterns 
only.  Determination  of  the  distances  of  a  sample  pattern  from  each 
calibration  pattern  is  a  very  simple  matter,  and  can  even  be  done  by 
hand  calculation. 


*A  handout  at  the  conference  served  as  a  basis  for  this  paper. 

**Reference: R. E. Bargmann, "A Method of Classification Based upon
Dependent 0-1 Patterns," IBM Research Report No. RC-677, April 1962.
The following quantities are obtained from each calibration pattern:

$x_{ti} = 0$ or $1$: the entry in row $t$ and column $i$ of the
calibration pattern. $N$ = number of rows, $p$ = number of columns.

\[ S_i = \sum_{t=1}^{N} x_{ti}, \qquad p_i = S_i/N \quad \text{(column averages)} \]

\[ S_{ij} = \sum_{t=1}^{N} x_{ti}\,x_{tj}, \qquad p_{ij} = S_{ij}/N \quad \text{(average number of 1-matches in columns } i, j\text{)} \]
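
The column sums and 1-match counts just defined are simple to compute. The
following is a minimal sketch, assuming a calibration pattern is stored as a
list of rows of 0's and 1's; the function name and the example pattern are
hypothetical.

```python
def column_statistics(pattern):
    """Column sums, column averages, 1-match counts and their averages.

    pattern is a list of N rows, each a list of p entries equal to 0 or 1.
    """
    n_rows = len(pattern)
    n_cols = len(pattern[0])
    # S[i]: number of 1's in column i; P[i]: column average S[i] / N.
    S = [sum(row[i] for row in pattern) for i in range(n_cols)]
    P = [s / n_rows for s in S]
    # S2[i][j]: number of rows with a 1 in both column i and column j.
    S2 = [
        [sum(row[i] * row[j] for row in pattern) for j in range(n_cols)]
        for i in range(n_cols)
    ]
    P2 = [[s / n_rows for s in row] for row in S2]
    return S, P, S2, P2


# Hypothetical calibration pattern: N = 4 rows, p = 3 diagnostic events.
example_pattern = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
]
S, P, S2, P2 = column_statistics(example_pattern)
```

Because the entries are 0 or 1, the diagonal count `S2[i][i]` equals the
column sum `S[i]`.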


If  rows  are  assumed  to  be  independent,  then 
°ii  " 

a  a  a  a 

°ij  “  [pij  -  PiPjl^1 

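If rows are assumed to be time-dependent (cyclical, autocorrelation of
lag 1) the following additional quantities are needed:

\[ C_i = \sum_{t=1}^{N} x_{ti}\,x_{t+1,i} \qquad (x_{1i} = x_{N+1,i}) \]

\[ D_{ij} = \sum_{t=1}^{N} x_{ti}\,x_{t+1,j} \quad \text{(1-matches, down)} \]

\[ \sum_{t=1}^{N} x_{t+1,i}\,x_{tj} \quad \text{(1-matches, up; the symbol for this sum is illegible in the source)} \]

\[ r_i = \frac{C_i - S_i^2/N}{S_i - S_i^2/N} \quad \text{(autocorrelation)} \]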
(If N is even, and if a perfectly alternating sequence occurs in a
column, i.e., 010101... or 101010..., r should be replaced by
2/(N+1) - 1.)
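
A minimal sketch of the lag-1 quantities follows, assuming the cyclical
convention that row N+1 is row 1 and taking the autocorrelation of column i
as (C_i - S_i^2/N) / (S_i - S_i^2/N). The function name, the example pattern,
and the choice to return None for a constant column (where the ratio is
undefined) are assumptions made for illustration.

```python
def lag1_statistics(pattern):
    """Cyclical lag-1 sums C, 'down' matches D, and autocorrelations r."""
    n_rows = len(pattern)
    n_cols = len(pattern[0])
    # nxt[t] is row t+1, wrapping around so that row N+1 is row 1.
    nxt = [pattern[(t + 1) % n_rows] for t in range(n_rows)]
    S = [sum(row[i] for row in pattern) for i in range(n_cols)]
    C = [
        sum(row[i] * nrow[i] for row, nrow in zip(pattern, nxt))
        for i in range(n_cols)
    ]
    # D[i][j]: a 1 in column i followed, one row down, by a 1 in column j.
    # The "up" sum for the pair (i, j) is the same count with i and j
    # interchanged, i.e. D[j][i].
    D = [
        [sum(row[i] * nrow[j] for row, nrow in zip(pattern, nxt))
         for j in range(n_cols)]
        for i in range(n_cols)
    ]
    r = []
    for i in range(n_cols):
        column = [row[i] for row in pattern]
        alternating = all(
            column[t] != column[(t + 1) % n_rows] for t in range(n_rows)
        )
        denominator = S[i] - S[i] ** 2 / n_rows
        if n_rows % 2 == 0 and alternating:
            r.append(2 / (n_rows + 1) - 1)   # replacement value from the text
        elif denominator == 0:
            r.append(None)                   # constant column: r is undefined
        else:
            r.append((C[i] - S[i] ** 2 / n_rows) / denominator)
    return C, D, r


# Hypothetical pattern: column 0 alternates, column 1 does not.
example_pattern = [
    [0, 1],
    [1, 1],
    [0, 1],
    [1, 0],
]
C, D, r = lag1_statistics(example_pattern)
```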

Then 


“ii 


Pl(l'Pl) 


N-l 


2ri 

(1  +  - ) 

1-f. 


(-  (N-2)/NJ  ,  if  a  column  consists  of  perfect 

zeros  or  ones) 


pirpipj 


N-l 


(l-r^U-r^)  N  (N-l) 


(l-r1)(l-rj)  N(N-l) 


Subject the matrix $\Sigma$ (or the corresponding correlation matrix; most
computer programs do the conversion automatically) to a Factor Analysis.
If the major event is assumed to have 2 states, extract one factor; if
k + 1 states, extract k factors. A crude technique (e.g., Centroid) or
even cruder ones (e.g., principal components which, alas, some computer
programs call "Factor Analysis") can be expected to yield satisfactory
results. For the special case of two states of the major event, a single
vector f (elements $f_i$) will be reported. From each calibration pattern,
the weights $w_i = 1/(1 - f_i^2)$ should be calculated.

Now, to calculate the distance (or rather, the -2 log likelihood
quantity) of a sample pattern from calibration pattern q, obtain the
average of each column in the sample pattern; call it $a_i$.

Then

\[
-2 \log L_q = \log\Bigl[1 + \sum_{j=1}^{p} (w_{jq} - 1)\Bigr]
  + \sum_{i=1}^{p} \log \sigma_{iiq}
  - \sum_{i=1}^{p} \log w_{iq}
  + k \sum_{i=1}^{p} \frac{(a_i - p_{iq})^2\, w_{iq}}{\sigma_{iiq}}
  - k\, \frac{\Bigl[\sum_{i=1}^{p} (a_i - p_{iq})\, f_{iq}\, w_{iq} / \sqrt{\sigma_{iiq}}\Bigr]^2}
             {1 + \sum_{j=1}^{p} (w_{jq} - 1)}
\]

where k = number of rows in the sample pattern, and all logs are to base
e. The last subscript q indicates that the corresponding value is to be
taken from the q'th calibration pattern.
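
The distance calculation can be sketched as follows, assuming the one-factor
form of the -2 log likelihood: a log-determinant part built from the
variances and the weights w_i = 1/(1 - f_i^2), plus k times a quadratic form
in the differences between the sample column averages and the calibration
column averages. The function names and all numerical values are
hypothetical.

```python
import math


def neg2_log_likelihood(a, k, p, sigma, f):
    """-2 log likelihood of sample column averages a against one calibration pattern.

    a     : column averages of the sample pattern
    k     : number of rows in the sample pattern
    p     : column averages of the calibration pattern
    sigma : diagonal variances of the calibration pattern
    f     : single-factor loadings of the calibration pattern
    """
    w = [1.0 / (1.0 - fi * fi) for fi in f]
    t = 1.0 + sum(wi - 1.0 for wi in w)
    log_det = (
        math.log(t)
        + sum(math.log(s) for s in sigma)
        - sum(math.log(wi) for wi in w)
    )
    d = [ai - pi for ai, pi in zip(a, p)]
    quad = sum(di * di * wi / si for di, wi, si in zip(d, w, sigma))
    cross = sum(
        di * fi * wi / math.sqrt(si) for di, fi, wi, si in zip(d, f, w, sigma)
    )
    return log_det + k * (quad - cross * cross / t)


def nearest_calibration(a, k, calibrations):
    """Index of the calibration pattern with the smallest distance, and all scores."""
    scores = [
        neg2_log_likelihood(a, k, c["p"], c["sigma"], c["f"]) for c in calibrations
    ]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Two hypothetical calibration patterns with three diagnostic events each.
calibrations = [
    {"p": [0.8, 0.7, 0.2], "sigma": [0.02, 0.03, 0.02], "f": [0.6, 0.5, 0.4]},
    {"p": [0.2, 0.3, 0.9], "sigma": [0.02, 0.03, 0.02], "f": [0.6, 0.5, 0.4]},
]
best, scores = nearest_calibration([0.8, 0.7, 0.2], k=5, calibrations=calibrations)
```

Only sums over the p columns are required once the calibration quantities
are available, which is why the distances can be found by hand calculation.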

IMPLICATIONS OF MODEL ASSUMPTIONS ON THE STRUCTURE OF THE COVARIANCE
MATRIX: If the major event has only two states, and $\alpha$ is the probability
that the major event is in state 1,

\[ N \cdot \Sigma = (\alpha - \alpha^2)\, \delta \delta' + \text{diagonal} \]

where the vector $\delta$ has elements $(p_{i1} - p_{i0})$, the difference between
the conditional probabilities of occurrence of diagnostic event i, given
that the major event is in state 1 or 0.

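If there are k + 1 states (or, with restrictions, several major
events), the covariance matrix has the structure

\[
P
\begin{bmatrix}
\alpha_1(1-\alpha_1) & -\alpha_1\alpha_2 & \cdots & -\alpha_1\alpha_k \\
-\alpha_1\alpha_2 & \alpha_2(1-\alpha_2) & \cdots & -\alpha_2\alpha_k \\
\vdots & \vdots & \ddots & \vdots \\
-\alpha_1\alpha_k & -\alpha_2\alpha_k & \cdots & \alpha_k(1-\alpha_k)
\end{bmatrix}
P' + \text{diagonal}
\]

where $\alpha_m$ denotes the probability that the major event is in state m,
and the matrix P has k columns (number of states minus 1). The element
in row i and column m is $(p_{im} - p_{i0})$.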

These  are  standard  factor  analysis  models.  The  matrices  are  easily 
inverted,  and  the  determinant  is  easily  found  —  thus,  the  calculation 
of  distances  from  a  sample  pattern  to  each  of  the  calibration  patterns 
can  be  most  easily  effected. 

A  direct  evaluation  of  the  conditional  probabilities  can  be  made 
only  if  assumptions  can  be  made  relative  to  the  probabilities  that  the 
artificial major event is in a given state. Such assumptions are
somewhat tenuous, inasmuch as the physical major event is not identical
(though  hopefully  highly  correlated)  with  the  artificial  major  event. 
[Figure: Calibration Pattern 1, a sample from 1 with 20% error, and a sample from 1 with 30% error.]
ABSTRACT. A $2^4$ factorial experiment was conducted to determine
the effects of 4 factors in a single-point tool turning operation.
Factors considered were A (tool material), B (cutting fluid type),
C (fluid application method), and D (fluid concentration). Factor B
(cutting fluid type) was of primary interest in this experiment.

An  analysis  of  variance  was  performed  using  Yates'  technique 
to  test  significance  of  the  different  factors  and  interactions  and 
to  determine  the  relative  importance  of  these  different  effects. 

The  results  of  this  analysis  indicate  that  the  type  of  cutting 
fluid  is  a  relatively  unimportant  factor  compared  with  the  method  of 
application  and  the  concentration  of  the  cutting  fluid. 

INTRODUCTION.  Cutting  fluids  are  applied  to  various  metal  cutting 
tools  to  help  prevent  excessive  heat  buildup  and  to  reduce  friction  at 
the  tool-chip  interface.  A  number  of  beneficial  effects  can  be  obtained 
if  a  cutting  fluid  can  perform  these  functions.  Tool  life  can  be  extended; 
or,  higher  cutting  speeds  can  be  used  while  maintaining  the  same  tool  life; 
or,  some  combination  of  higher  speed  and  longer  tool  life  can  be  obtained. 
Tolerances  and  surface  finish  may  improve  or  be  easier  to  maintain  with  an 
effective  cutting  fluid. 

Various  users  and  manufacturers  of  cutting  fluids  have  developed 
formal  performance  tests  to  evaluate  and  compare  different  cutting  fluids, 
mainly  for  their  own  special  interests.  Unfortunately,  these  tests  have 
not  been  standardized;  no  specific  procedure  has  been  widely  accepted; 
and,  rarely,  is  any  formal  significance  test  made.  Also,  the  importance 
of  optimizing  the  cutting  fluid  is  not  usually  determined  relative  to  the 
importance  of  optimizing  other  parameters  such  as  tool  geometry  or  material. 
In  many  cases  elaborate  programs  are  set  up  for  cutting  fluid  selection; 
but,  in  the  same  shop  no  organized  effort  is  made  to  optimize  cutting  speeds 
and  feeds  or  any  of  the  other  parameters  affecting  the  machining  operation. 
In  fact,  experimental  design  and  statistical  analysis  have  been  notoriously 
lacking  in  the  whole  field  of  metal  cutting  research.  A  typical  comment 
overheard  in  a  conversation  between  some  colleagues  went  something  like 
this:  "Statistics  is  fine,  but  we  can't  run  that  many  tests  in  metal 
cutting."  The  idea  that  a  great  number  of  test  runs  is  necessary  to 
facilitate  statistical  analysis  is  complete  nonsense!  Experiments  can 
often  be  reduced  in  size  by  proper  design  and  consideration  of  the  analysis 
to  be  performed.  It  is  certainly  uneconomical  to  make  experiments  larger 
than  necessary. 
A $2^4$ FACTORIAL EXPERIMENT. As an illustration of a type of
experimental design which can be used in the metal cutting field, the remainder
of this paper describes a $2^4$ factorial experiment (4 factors each at 2
levels).  This  experiment  was  conducted  to  determine  the  effects  of  four 
factors  in  a  single-point,  lathe,  turning  operation.  These  factors  were: 

Factor  A  -  Tool  material 
Factor  B  -  Cutting  fluid  type 
Factor  C  -  Method  of  fluid  application 
Factor  D  -  Fluid  concentration 

Each factor was tested at two levels, thus making an experiment of
16  observations.  The  two  cutting  fluids  tested  were  Fluid  A  (a  heavy- 
duty,  chlorinated,  soluble  fluid)  and  Fluid  B  (a  fluid  specially  formulated 
for  mist  application).  Each  fluid  was  used  at  two  different  dilutions 
(20:1  and  35:1),  and  the  two  different  methods  of  application  were 
conventional  flood  and  mist. 

It  should  be  understood  that  an  experiment  of  16  observations  is 
certainly  a  small  experiment;  but,  it  could  be  readily  expanded  by  adding 
more  factors  and/or  using  more  than  2  levels.  The  mathematical  model  of 
this  experimental  design  was: 

\[
Y_{tijk} = \mu + A_t + B_i + C_j + D_k + AB_{ti} + AC_{tj} + BC_{ij} + AD_{tk} + BD_{ik}
  + CD_{jk} + ABC_{tij} + ABD_{tik} + ACD_{tjk} + BCD_{ijk} + ABCD_{tijk}
\]


The tool life was obtained for each of the 16 different treatment
combinations at 4 different cutting speeds. A computerized regression
analysis gave a tool life vs. cutting speed relationship of the form
$V = V_1 T^n$, where T = tool life (minutes), V = cutting speed (surface
speed of workpiece in feet per minute), $V_1$ = cutting speed for 1 minute
tool life, and n = a determined exponent. Estimates of $V_{20}$ (the cutting
speed corresponding to a 20 minute tool life) were obtained from these
equations.
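
A fit of this kind can be sketched as an ordinary least-squares line in
log-log coordinates, log V = log V_1 + n log T, followed by evaluation at
T = 20 minutes. This is only an illustration of the general approach, not
the program used in the study; the function names and the speed and tool
life figures are hypothetical.

```python
import math


def fit_tool_life(speeds, lives):
    """Least-squares fit of V = V1 * T**n in log-log coordinates.

    Returns (V1, n). speeds are cutting speeds in feet per minute and
    lives are the corresponding tool lives in minutes.
    """
    xs = [math.log(t) for t in lives]
    ys = [math.log(v) for v in speeds]
    m = len(xs)
    x_bar = sum(xs) / m
    y_bar = sum(ys) / m
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    n = s_xy / s_xx
    v1 = math.exp(y_bar - n * x_bar)
    return v1, n


def speed_for_life(v1, n, life_minutes=20.0):
    """Cutting speed that corresponds to the requested tool life."""
    return v1 * life_minutes ** n


# Hypothetical observations at 4 cutting speeds, generated from an exact
# curve with V1 = 400 and n = -0.25 so that the fit can be checked.
lives = [5.0, 10.0, 20.0, 40.0]
speeds = [400.0 * t ** -0.25 for t in lives]

v1, n = fit_tool_life(speeds, lives)
v20 = speed_for_life(v1, n)
```

With speed written as a function of tool life in this way, the fitted
exponent is negative, since longer tool lives correspond to lower cutting
speeds.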

These estimates of $V_{20}$ are presented in Table I. These data were
then used in a formal analysis of variance using Yates' technique
(Table II).

Yates' technique gives the sums of squares for all the effects
without the need of memorizing or looking up any equations and, thus, is
a powerful tool for analysis of variance. The ANOVA table is shown in
Table III. The four three-factor interactions and the four-factor
interaction have been
pooled to form a residual term with 5 degrees of freedom. This is justified
in this case since all of these terms are of the same order of magnitude.
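
Yates' technique is mechanical enough to sketch in a few lines: with the
yields in standard order, each pass writes the sums of adjacent pairs
followed by their differences, and after as many passes as there are factors
the squared contrasts divided by the number of observations give the sums of
squares. The function names are hypothetical; the eight yields are the
carbide values reported in Table IV.

```python
def yates(yields):
    """Return the final Yates column (grand total followed by the contrasts).

    yields must be in standard order: (1), b, c, bc, d, bd, cd, bcd, ...
    """
    n = len(yields)
    passes = n.bit_length() - 1
    if n != 2 ** passes:
        raise ValueError("number of yields must be a power of two")
    column = list(yields)
    for _ in range(passes):
        sums = [column[i] + column[i + 1] for i in range(0, n, 2)]
        diffs = [column[i + 1] - column[i] for i in range(0, n, 2)]
        column = sums + diffs
    return column


def sums_of_squares(yields):
    """Sum of squares for each effect: contrast squared over the number of runs."""
    n = len(yields)
    return [c * c / n for c in yates(yields)[1:]]


# Carbide yields from Table IV, in standard order.
labels = ["b", "c", "bc", "d", "bd", "cd", "bcd"]
carbide = [231, 232, 217, 219, 225, 202, 219, 196]

contrasts = yates(carbide)
ss = dict(zip(labels, sums_of_squares(carbide)))
```

Run on these yields, the sketch reproduces column (3) of Table IV and the
sums of squares derived from it.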
TABLE III
ANOVA Table

Source                        SS            DF    MSR

A (tool material)             76,867.5625   1     2,057.345***
B (fluid)                         95.0625   1         2.544
C (application method)           232.5625   1         6.224*
D (concentration)                451.5625   1        12.086*
AB                               138.0625   1         3.695
AC                             1,207.5625   1        32.320**
AD                                52.5625   1         1.407
BC                                27.5625   1          .738
BD                               410.0625   1        10.975*
CD                               138.0625   1         3.695
Residual (pooled)                186.8125   5
  ABC      22.5625
  ABD      18.0625
  ACD      18.0625
  BCD      60.0625
  ABCD     68.0625

Residual mean square = 186.8125 / 5 = 37.3625

F(1,5,.95) = 6.61     F(1,5,.99) = 16.26     F(1,5,.999) = 47.18

  *significant at 95% confidence level
 **significant at 99% confidence level
***significant at 99.9% confidence level
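
The mean square ratios in Table III follow directly from the sums of
squares: the five pooled interaction terms are added and divided by their 5
degrees of freedom, and each single-degree-of-freedom effect is divided by
that residual mean square. The short sketch below repeats the arithmetic
using the values in the table; the variable names are hypothetical.

```python
# Sums of squares for the main effects and two-factor interactions (Table III).
effects = {
    "A": 76867.5625, "B": 95.0625, "C": 232.5625, "D": 451.5625,
    "AB": 138.0625, "AC": 1207.5625, "AD": 52.5625,
    "BC": 27.5625, "BD": 410.0625, "CD": 138.0625,
}

# Three-factor and four-factor interactions pooled to form the residual.
pooled = {
    "ABC": 22.5625, "ABD": 18.0625, "ACD": 18.0625,
    "BCD": 60.0625, "ABCD": 68.0625,
}

residual_ss = sum(pooled.values())
residual_df = len(pooled)
residual_ms = residual_ss / residual_df

# Each effect has 1 degree of freedom, so its mean square equals its SS.
ratios = {name: value / residual_ms for name, value in effects.items()}
```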
INTERPRETATION OF ANOVA TABLE. The significant AC (tool material
X  application  method)  interaction  indicates  that  the  application  method 
best  for  one  tool  material  may  not  work  well  on  the  other  tool  material. 

Also,  the  high  BD  (fluid  X  concentration)  interaction  indicates  that  the 
best  concentration  depends  upon  the  fluid  used. 

The  cutting  fluid  type  (Factor  B)  appears  to  be  a  relatively 
unimportant  factor  compared  with  the  application  method  and  the  concen¬ 
tration. 

The  tool  material  (Factor  A)  was  a  very  highly  significant  factor, 
as  expected,  since  carbide  and  cast  alloy  are  quite  different  in  character. 

This  factor  was  so  dominant  that  it  appeared  to  be  desirable  to  analyze  the 
data  for  carbide  and  cast  alloy  as  two  separate  experiments.  This  was  done, 
and the results of this analysis are presented in Tables IV and V, respectively.

INTERPRETATION  OF  ANOVA  TABLE  FOR  CARBIDE.  Analysis  of  data  using 
carbide tools shows that all of the main effects were formally significant
in  the  following  order: 

1.  Factor  D  -  (Concentration) 

2.  Factor  B  -  (Fluid  type) 

3.  Factor  C  -  (Application  method) 

The  best  combination  for  carbide  was  flood  application  of  fluid  A  at 
the  20:1  concentration. 

INTERPRETATION  OF  ANOVA  TABLE  FOR  CAST  ALLOY.  Considering  the  cast 
alloy  tool  material  alone,  only  Factor  C  (method  of  fluid  application) 
was formally significant. Mist application was much better with cast alloy
tools.

CONCLUSION.  As  this  paper  clearly  illustrates,  statistical  design 
and  analysis  can  be  effectively  used  in  metal  cutting  experiments.  The 
factorial  design  is  particularly  well  suited  to  these  experiments.  Yates' 
Technique,  applied  to  a  factorial  experiment,  is  not  difficult  and  can  be 
carried  out  without  any  computational  equipment. 
TABLE IV

Yates' Table (Carbide)

Treatment   Yield    (1)    (2)     (3)    SS = (3)^2 / 8

(1)          231     463    899    1741
b            232     436    842     -43      231.125
c            217     427      3     -39      190.125
bc           219     415    -46       1         .125
d            225       1    -27     -57      406.125
bd           202       2    -12     -49      300.125
cd           219     -23      1      15       28.125
bcd          196     -23      0      -1         .125

total       1741                            1155.875

ANOVA Table (Carbide)

Effect     SS         DF    MSR

B          231.125    1     1849*
C          190.125    1     1521*
D          406.125    1     3249*
BC            .125    1        1
BD         300.125    1     2401*
CD          28.125    1      225*
BCD           .125    1

F(1,1,.95) = 161.4

Factors:  B = Fluid
          C = Application method
          D = Concentration
TABLE V

Yates' Table (Cast Alloy)

Treatment   Yield    (1)    (2)    (3)    SS = (3)^2 / 8
(1)           68     148    330    632
b             80     182    302      4          2
c             88     118     18    100       1250
bc            94     184    -14     20         50
d             69      12     34    -28         98
bd            49       6     66    -32        128
cd            89     -20     -6     32        128
bcd           95       6     26     32        128
total        632                             1784


ANOVA Table (Cast Alloy)

Effect                      SS     DF      MS       MSR
B                            2      1       2       .018
C                         1250      1    1250     11.521*
D                           98      1      98       .903
BC, BD, CD, BCD pooled     434      4     108.5     ...
  (BC 50, BD 128, CD 128, BCD 128)

F(1, 4; .95) = 7.71

Factors:  B - Fluid
          C - Application method
          D - Concentration
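
The mean square ratios in the cast alloy ANOVA table can be retraced with a few lines of Python. This is only a sketch of the arithmetic: the dictionary layout and variable names are assumptions, while the sums of squares and the tabled value F(1, 4; .95) = 7.71 are taken from the tables above. The four interaction sums of squares are pooled to form the error term.

```python
# Sums of squares from the Yates table for cast alloy.
ss = {"B": 2, "C": 1250, "D": 98, "BC": 50, "BD": 128, "CD": 128, "BCD": 128}

# Pool the four interactions into a single error term with 4 d.f.
interactions = ["BC", "BD", "CD", "BCD"]
error_ss = sum(ss[name] for name in interactions)
error_df = len(interactions)
error_ms = error_ss / error_df

f_crit = 7.71  # tabled F for (1, 4) d.f. at the .95 point

# Each main effect has 1 d.f., so its mean square equals its sum of squares.
msr = {name: ss[name] / error_ms for name in ("B", "C", "D")}
significant = [name for name, ratio in msr.items() if ratio > f_crit]

for name, ratio in msr.items():
    flag = "*" if name in significant else ""
    print(f"{name}  {ratio:7.3f}{flag}")
```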
ABSTRACT. Measurement of One Aspect of Vehicular Mobility.

Measurements  of  vehicular  mobility  have  usually  been  conducted  as 
"go-no-go"  tests,  in  which  vehicles  are  matched  against  obstacles  until 
they  can  no  longer  proceed,  or  "jury  system"  tests  which  rely  upon 
qualitative  judgments  based  on  opinions  of  observers  and/or  drivers  of 
the  vehicles  under  test.  As  a  new  approach  this  project  investigates 
the  feasibility  of  using  a  statistically  designed  test  which  is 
reasonably  unbiased  and  provides  some  measurement  of  precision  for 
evaluating  mobility  of  the  vehicles. 

The  paper  describes  the  design  problems  presented  for  developing 
a  test  program,  the  experimental  design  selected,  the  field  conduct 
of  the  test  program,  and  results  of  the  test.  Test  data  were  limited 
to time required for a vehicle-driver combination to traverse a
prescribed course. The report covers a total of 450 runs, using 18 drivers,
ten  vehicles,  and  27  test  courses  over  three  different  terrains. 
INTRODUCTION


Mobility  has  long  been  a  major  aspect  of  consideration  in 
warfare.  In  the  year  218  B.  C.,  Hannibal  crossed  the  Alps  and 
subsequently  won  the  first  of  many  battles  from  the  Romans. 

In  addition  to  horses  Hannibal  utilized  a  few  elephants  which 
apparently  increased  his  overall  mobility  of  materiel. 

With the advent of motorized vehicles considerable progress
was made in the transportation of men and materiel. This
progress was due in great part to the roads and highways which
were built as part of the transportation complex.

Roads  are  often  not  available  to  supply  front  line  troops 
during  wartime  or  for  other  use  during  national  emergencies.  In 
recent  years  then,  a  prime  consideration  in  the  design  of  a 
military  vehicle  has  been  off-the-road  mobility. 

Measurements  of  vehicular  mobility  generally  have  been 
grouped  into  two  types,  the  "go-no-go"  and  the  "jury  system". 

In the "go-no-go" type, the vehicles are pitted against various
obstacles (ditches, steep inclines, swamps, etc.) until they
can no longer proceed. The "jury system" uses the combined
opinions of the drivers and observers for evaluation. These tests
give useful results but are subject to certain weaknesses. For
example, the courses are usually well defined, not properly
replicated, and performance of a vehicle can be greatly influenced
by the driver.

As  a  new  approach,  this  project  investigated  the  feasibility 
of using a statistically designed test which is reasonably
unbiased and provides some measurement of precision for evaluating
mobility  of  the  vehicles  in  a  tactical  cross-country  situation. 

In a tactical situation, the driver often may not be familiar
with  the  area,  and  paths  to  follow  are  not  defined.  Roads  may  be 
mined.  The  driver  may  avoid  obstacles  if  possible,  and  the 
time  required  to  reach  a  destination  may  be  an  important  factor 
for  the  successful  completion  of  a  mission. 

CONSIDERATIONS OF THE TEST

At  an  early  stage  in  the  development  of  the  statistical 
design,  some  basic  issues  were  resolved. 

1.  These  tests  were  intended  to  measure  only  one  aspect 
of mobility. This was the time required for a vehicle-driver
combination to traverse from point A to point B where the
course is defined only by the points A and B except where
auxiliary markers may be needed to keep the driver on course.


2. The experimental unit was the course. It was not
practical to provide the number of courses required to perform
all the desired tests and still have the drivers limited to
only one traverse of a course. This aspect was desired; otherwise
a learning factor would be introduced when a driver traversed
a course more than once. As an alternative perhaps
many drivers could be used and thus reduce the number of courses
required.


3.  The  courses  selected  would  be  about  the  same  length, 
approximately  measured,  and  not  accurately  surveyed.  A  course 
length  of  somewhere  between  5  and  20  miles  seemed  reasonable. 
[Examination  of  the  data  showed  that  the  actual  lengths  varied 
from  0.6  mile  in  Terrain  III  to  2.7  miles  in  Terrain  I, 
approximately].

4. The tests were to be conducted in Nevada with the
cooperation  of  the  Nevada  Automotive  Test  Center.  Three  types 
of  terrain  were  selected  to  give  greater  meaning  to  any  results 
or  conclusions  obtained.  The  terrains  were  defined  as  follows: 

a.  Terrain  I:  Flat  and  open  with  small  irregularities 
in  the  form  of  dry  washes,  and  scattered  areas  of  sagebrush  one 
to  two  feet  in  height.  Obstacles  were  minor  in  nature. 


b.  Terrain  II:  Hilly  and  open  with  rolling  hills, 
and  areas  of  deep  washes  and  sheer  drops.  This  area  contained 
outcrops  of  rock  and  scattered  areas  of  sagebrush  similar  to 
Terrain  I. 


c.  Terrain  III:  Hilly  and  timber  covered.  Areas  of 
trees  were  scattered  between  open  spaces  of  sagebrush  and  grass. 
The  trees  were  closely  spaced  pine  ranging  between  five  and 
twenty-five  feet  tall.  This  was  the  most  difficult  of  the 
three  terrains. 


5. The supply of drivers was not a problem. However,
the supply of experienced drivers was limited. By definition, a
driver was classified as experienced or novice according to his
own statements as to his ability and/or experience to drive on
the highway and cross country.
6. It was planned that the drivers would be instructed
to  traverse  the  course  at  the  fastest  speed  they  felt  they 
could  go  without  damaging  the  vehicle  or  injuring  personnel. 

7. A referee was to ride in each vehicle. The referee
was the official timekeeper. He would also record any other
information that might affect interpretation of the data. For
example, a driver may become bogged down, or lost, or the
vehicle may not be performing properly. The referees were
also responsible for the safety of the vehicles and occupants
by having the driver avoid any maneuver which could result
in damage to the vehicle or occupants. The referees were
to be familiar with the particular courses to which they were assigned.

8.  Nine  or  ten  vehicles  were  expected  to  be  available  for 
this  test.  The  ones  used  would  be  those  available  at  the  time 
of  the  test. 

CONSIDERATIONS  IN  THE  DESIGN 

The  primary  interest  in  these  tests  was  to  determine  if  a 
designed  experiment  could  be  useful  for  evaluating  factors 
that  affect  the  mobility  of  vehicles.  This  objective  could  be 
met  if  it  were  possible  to  design  a  test  which  could  differentiate 
between  vehicles,  at  a  specified  confidence  level.  Any  other 
information  obtained  would  be  useful  for  designing  future  tests. 

Considerations  were  as  follows: 

1.  Vehicle  effect 

2.  Course  effect 

3.  Driver  effect 

a.  experienced 

b.  novice 

4.  Terrain  effect 

5.  Marking  of  courses 

6. Order of testing. The tests were expected to require
several weeks. The weather could be a factor.

7.  Tracks  left  from  a  previous  run  on  the  same  course. 

8.  Referee  effect 

9.  Interaction  effects 

a.  Vehicle  -  course 

b.  Driver  -  course 

c.  Vehicle  -  driver 

d.  Vehicle  -  terrain 

e.  Driver  -  terrain 


SELECTION  OF  THE  DESIGN 


At least five major factors had to be accounted for in the
design. These were order of testing (runs), courses, vehicles,
drivers, and terrain. The other factors would have to be controlled
by conducting the test with care, or else be considered not
significant. Comments are as follows:

1.  Marking  of  the  courses  should  present  no  problem  in 
Terrain  I  but  in  the  hilly  and/or  timber  covered  Terrains  II 
and  III  care  should  be  exercised  so  that  a  driver  could  easily 
determine  the  course  by  following  the  check  point  markers. 

2.  After  a  course  was  used  once  there  would  then  be  a 
path  to  follow.  It  was  decided  that  before  a  test  run  was 
made,  each  course  would  be  traversed  once.  In  addition,  each 
course  would  have  two  or  three  false  trails  at  the  start. 

The  purpose  of  this  was  to  give  the  first  driver  an  environment 
similar  to  that  of  the  following  drivers.  Drivers  were  instructed 
not  to  follow  previous  tracks  unless  absolutely  necessary. 
Generally  there  were  no  roads  to  follow  but  in  case  a  driver 
did  come  across  an  established  road  he  was  instructed  to  assume 
it  was  mined,  in  which  case  his  maximum  speed  could  not  exceed  the 
two  or  three  miles  per  hour  of  mine  sweeping  operations. 

3. The referee effect was to be controlled by careful
selection  and  uniform  instruction  to  those  selected  as  referees. 
Also,  the  referees  were  to  establish  the  courses  so  they  could 
become  familiar  with  them  before  the  tests  were  started. 

4.  Each  course  could  have  been  laid  out  across  all  three 
terrain  types.  This  would  still  satisfy  the  primary  objective 
of  the  experiment,  but  it  would  give  no  information  on  terrain 
effect nor on the interaction effects of vehicle-terrain and
driver-terrain.

5. One way to cope with a problem of this size is to adopt
the Graeco-Latin square as the basic structure for the experimental
plan. With this choice only four factors can be used.
The basic structure would include runs, courses, vehicles,
and drivers. To obtain any evaluation of terrain effect, each
square would have to be repeated for each terrain. The Latin
square and Graeco-Latin square have the limitation that no
interaction effects can be measured. It seems reasonable that
some interaction effects are probably present. If present, these
effects would inflate the error sum of squares and decrease the
sensitivity of the test. In retrospect, one driver was unable
to complete some of the runs in Terrain III because of his
inability to handle the vehicle on these courses. Nevertheless,
it was assumed that interaction effects would not
seriously affect the analysis and the Graeco-Latin square
was adopted as an acceptable design for this experiment.

6.  Information  was  desired  on  experienced  driver  versus 
novice  driver.  The  test  was  designed  such  that  nine  of  each 
were  assigned.  Drivers  were  randomly  divided  into  two  groups 
with  the  requirement  that  one  group  contain  four  experienced 
and  five  novice  drivers  and  the  second  group  contain  five 
experienced and four novice drivers. Each group was then
assigned  to  either  the  first  or  second  square  for  each  terrain. 

THE GRAECO-LATIN SQUARE

A Graeco-Latin square of side N is defined as a square
layout of N rows and N columns with N Latin and N Greek letters
filling the N² cells with the following restriction: each
letter (Latin or Greek) may appear only once in each row and once
in each column, and each Latin-Greek combination may appear
only once. Graeco-Latin squares do not exist for all sizes.
A square of side six is not known. One of side ten was only
recently determined.

In this experiment the Latin treatments represent vehicles
and are designated by capital letters. The Greek treatments
represent drivers and are designated by numbers. Rows and
columns represent order of run and course, respectively.

Correct randomization procedures must be used when conducting
the experiment using a Graeco-Latin square design.
The general procedure is as follows: Randomly select a
square of the size required from a listing of the squares that
are different from one another; that is, they are not convertible
into one another by permuting rows and/or columns.
After selection of a basic square, the rows are permuted randomly,
then the columns are permuted randomly, and finally the Latin and
Greek treatments are randomly assigned.
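
The randomization steps just described can be sketched in Python. This is a hedged illustration, not the procedure actually used in the test: instead of drawing a basic square from a published listing, the sketch builds a single cyclic square (valid for odd side only, so it would not serve for the 8 X 8 case), and all function names and the seed are assumptions. Vehicles are printed as letters and drivers as numbers, with rows as order of run and columns as courses.

```python
import random


def cyclic_graeco_latin(n):
    """Basic Graeco-Latin square of odd side n; each cell is (latin, greek)."""
    if n % 2 == 0:
        raise ValueError("this cyclic construction requires an odd side")
    return [[((i + j) % n, (i + 2 * j) % n) for j in range(n)] for i in range(n)]


def randomize(square, rng):
    """Permute rows, then columns, then relabel Latin and Greek treatments."""
    n = len(square)
    rows = rng.sample(range(n), n)
    cols = rng.sample(range(n), n)
    latin = rng.sample(range(n), n)
    greek = rng.sample(range(n), n)
    return [[(latin[square[r][c][0]], greek[square[r][c][1]]) for c in cols]
            for r in rows]


def is_graeco_latin(square):
    """Check the defining restrictions of a Graeco-Latin square."""
    n = len(square)
    full = set(range(n))
    for k in (0, 1):  # 0 = Latin treatment, 1 = Greek treatment
        for i in range(n):
            if {square[i][j][k] for j in range(n)} != full:
                return False  # a treatment repeats within row i
            if {square[j][i][k] for j in range(n)} != full:
                return False  # a treatment repeats within column i
    # Every Latin-Greek combination must occur exactly once.
    return len({cell for row in square for cell in row}) == n * n


rng = random.Random(1968)  # arbitrary seed, for a repeatable illustration
plan = randomize(cyclic_graeco_latin(9), rng)

vehicles = "ABCDEFGHI"
for row in plan:  # one printed row per run order, one column per course
    print(" ".join(f"{vehicles[v]}{d + 1}" for v, d in row))
```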
PROBLEMS ENCOUNTERED DURING THE TEST


1. It was anticipated that some drivers would become disoriented
while traversing a course. One task of the referee was
to prevent this when it appeared the driver was in the process
of becoming lost. There were a few incidents of this nature,
including one where the referee also became disoriented. These
incidents were recorded by the referee. Upon completion of
the test the project engineer and the referee discussed the
individual incidents and made a decision whether or not to
accept the elapsed time as a data point or discard the data as an
outlier. In general, when a driver became lost for more than
four or five minutes, that time datum was rejected, since this
was not the fault of the vehicle, and the vehicle was the factor of
primary interest. A discarded test run was not rerun.

2.  In  a  few  instances,  a  vehicle  bogged  down.  Again, 
when  excessive  time  was  required  for  the  vehicle  to  again  get 
under  way,  the  time  datum  for  that  run  was  rejected. 

3. A driver-terrain interaction effect or, more precisely,
a driver-course interaction effect became evident during the
test. In particular, one driver lost confidence in controlling
some of the vehicles during the tests in Terrain III. In these
instances, the referee had to drive the vehicle back to camp.
Since the time data for these runs were not used, any analysis
for driver-terrain interaction would be biased.

4. Some of the courses within the same terrain were
more difficult to negotiate than others in the same terrain.
Differences in vehicles contributed to an apparent interaction
effect. For example, Vehicle I was an armored car, and this vehicle
has a high center of gravity which could be dangerous in the
hilly courses of Terrains II and III. Two vehicles were driven
with the hatch closed and vision was limited to that obtainable
through the vision blocks. Conditions of this nature did
result in a few uncompleted runs (as previously mentioned), or
data which were subsequently not used.

5.  The  referees  did  not  react  equally  to  hazardous 
situations.  During  off-duty  hours,  the  drivers  would  discuss 
actions  of  the  referees.  Thus  the  drivers  obtained  an  insight 
into  how  a  referee  would  react  under  certain  conditions.  As  a 
result,  the  drivers  had  a  tendency  to  modify  their  driving 
according  to  who  the  referee  was. 
6. The referees also were not uniform in controlling the
test  when  a  driver  wandered  off  course.  The  time  allowed 
before  a  referee  gave  the  driver  instructions  in  these  cases 
apparently  varied  considerably.  Since  the  data  obtained  were 
the  times  required  to  traverse  the  courses,  the  referees  did 
influence  the  outcome  of  the  tests. 

7. Drivers were instructed not to follow trails left by
previous vehicles. By the time the second square was begun, courses
were covered with trails and it became increasingly difficult
to keep the drivers off these trails. The subsequent analysis of
variance did not show a significant run effect at the
five percent significance level.

8. Vehicle E1, a 5000-gallon fuel tanker truck, was withdrawn
from the test after completion of the runs of Terrain I.
This vehicle was difficult to control over the basically flat terrain
of Terrain I. It was judged best for the safety of the drivers
and vehicle not to use that vehicle for Terrains II and III. A
1-1/4-ton cargo truck designated E2 replaced Vehicle E1 for
Terrains II and III.

9. There were instances of mechanical breakdown of a
vehicle during a test run, which required varying amounts of
time to repair. There were also instances when a vehicle performed
below par. This again would result in a judgment by the
project engineer and referee whether to accept or reject the
time datum for that run.

10. Because of mechanical difficulties Vehicle C proved
inadequate in Terrain III. Tests with this vehicle were stopped
after the first square in Terrain III. A replacement vehicle
was not available so an 8 X 8 Graeco-Latin square had to be
designed for the second square of Terrain III, in lieu of the
9 X 9 size used for the previous five squares.

11.  Vehicle  I  broke  an  axle  and  did  not  finish  tests  in 
Square  2  of  Terrain  III. 

THE  DATA 

Data  to  be  analyzed  were  data  for  Squares  1  and  2  for  each 
of  Terrains  I,  II,  and  III. 

Data for one of these six Graeco-Latin squares are shown in
Table I. The small squares indicate where data are missing. The
minimum and maximum times required for traversing a course are
shown within the two circles. The extremes for this square
give a range of 5.1 to 52.8 minutes. Other recorded data were
used to compute average vehicle speed and the average vehicle
miles driven per course. A summary of these data is shown
in Table II. The courses as laid out were much shorter
than originally suggested. The courses are listed as miles
driven rather than length. In Terrain III especially, some of
the larger vehicles with a large turning radius had to detour
around some obstacles that smaller vehicles could negotiate.

The overall data obtained for analysis showed the following:

Terrain I, Square 1 had one empty cell.

Terrain I, Square 2, and Terrain II, Squares 1 and 2, were
complete.

Terrain III, Square 1 had eleven empty cells.

Terrain III, Square 2 had seven empty cells in the
8 X 8 square.
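
As a rough consistency check, these cell counts can be compared with the 450 runs quoted in the abstract. Assuming that every non-empty cell corresponds to one recorded run, the five 9 X 9 squares and the one 8 X 8 square, less the empty cells, give:

$$5 \times 81 + 64 - (1 + 11 + 7) = 469 - 19 = 450$$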

ANALYSIS OF THE DATA


Analysis of variance tables for the squares having no
empty cells were computed in a straightforward manner. Analyses
were performed in two ways for the three squares having empty
cells. The first analyses were obtained by estimating the
missing data, then performing the standard analysis of variance
computations. This method results in an upward bias for the
treatment sum of squares, so the data were also analyzed by
regression analysis to obtain an unbiased value for the
treatment sum of squares.

To determine the missing values, these missing data were
designated as a, b, c, ... etc. Then steps were set up for
an analysis of variance. The error sum of squares is defined
in the usual manner; that is, it is the remainder after the
treatment sums of squares are subtracted from the corrected
total sum of squares. The error sum of squares is thus
determined in terms of the unknowns. Partial differentiation
is performed on the error term with respect to each of the
unknown missing values and the derivatives are set equal to zero.
The resulting set of equations is solved for the missing values.
Since the error term was minimized, the remainder sum of squares
is unbiased.
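
The idea of choosing the missing values so that the error sum of squares is a minimum can be illustrated numerically. The sketch below is not the computation used in the paper: rather than solving the simultaneous equations obtained by partial differentiation, it repeatedly replaces each missing cell by its fitted value under the additive model, which settles on the same condition (a zero residual in every missing cell). The 5 X 5 square, the effects, and all names are hypothetical.

```python
def graeco_latin(n):
    """Cyclic Graeco-Latin square of odd side n; each cell is (latin, greek)."""
    return [[((i + j) % n, (i + 2 * j) % n) for j in range(n)] for i in range(n)]


def fitted_value(y, design, i0, j0):
    """Additive-model fit for cell (i0, j0) of a completely filled square."""
    n = len(y)
    cells = [(i, j) for i in range(n) for j in range(n)]
    grand = sum(y[i][j] for i, j in cells) / (n * n)
    latin0, greek0 = design[i0][j0]
    row = sum(y[i0][j] for j in range(n)) / n
    col = sum(y[i][j0] for i in range(n)) / n
    latin = sum(y[i][j] for i, j in cells if design[i][j][0] == latin0) / n
    greek = sum(y[i][j] for i, j in cells if design[i][j][1] == greek0) / n
    return row + col + latin + greek - 3 * grand


def estimate_missing(y, design, sweeps=500):
    """Fill the cells holding None so that their residuals vanish."""
    n = len(y)
    missing = [(i, j) for i in range(n) for j in range(n) if y[i][j] is None]
    observed = [v for row in y for v in row if v is not None]
    start = sum(observed) / len(observed)  # start from the observed mean
    filled = [[start if v is None else v for v in row] for row in y]
    for _ in range(sweeps):
        for i, j in missing:
            filled[i][j] = fitted_value(filled, design, i, j)
    return filled


# Hypothetical, exactly additive traverse times (minutes) with two cells removed.
n = 5
design = graeco_latin(n)
run_eff = [0.0, 1.0, -1.0, 2.0, -2.0]
course_eff = [3.0, -3.0, 0.5, -0.5, 0.0]
vehicle_eff = [-4.0, 4.0, 1.5, -1.5, 0.0]
driver_eff = [2.5, -2.5, 0.0, 1.0, -1.0]
truth = [[30.0 + run_eff[i] + course_eff[j]
          + vehicle_eff[design[i][j][0]] + driver_eff[design[i][j][1]]
          for j in range(n)] for i in range(n)]
data = [row[:] for row in truth]
data[0][0] = None
data[2][3] = None

estimate = estimate_missing(data, design)
print(round(estimate[0][0], 4), round(estimate[2][3], 4))
```

Because the hypothetical data contain no error, the estimates should reproduce the removed values; with real data they would be the values that leave the remainder sum of squares as small as possible.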


This analysis was also obtained using the experimental
design model:

$$y = \sum_{i=1}^{p} \beta_i X_i + e, \qquad i = 1, 2, \ldots, 37 \quad \text{(for the } 9 \times 9 \text{ square)}$$

This is a general linear model of less than full rank.
The X_i's take only the values 0 or 1. In the 9 X 9 square,
the 9 levels for each of the four factors, plus one for the
mean, give an X matrix of size n X 37, with n equal to the
number of observations. Square 1 of Terrain III, with eleven
missing values, gave an X matrix of size 70 X 37, and β was
solved from the normal equations for this model,

$$X^{T} X b = X^{T} y$$


A solution was derived by arbitrarily equating to zero
the β_i's corresponding to the ninth level for each of the
four factors, partitioning the matrices, and solving with the
reduced X matrix of size 70 X 33, which was of full rank
(reparametrization). Now, one of the conditions that may be
applied in solving the regression equation is that the sum of
the β's for each factor is equal to zero. A linear transformation
was imposed on the β's to meet this condition as
follows: The β's for each factor were summed, the result
divided by nine, and this amount subtracted from each of the
nine β's. The general mean was also adjusted by the same amount.
The β's or b's were thus departures from mean time and could
be interpreted directly. A large negative b meant that this
level of the factor had the effect of traversing the course
in a much shorter time than the average time.
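
The transformation to the sum-to-zero condition can be sketched as follows. The estimates shown are hypothetical, and the function name `center_effects` is an assumption. The sketch also assumes that "adjusted by the same amount" means the amount subtracted from each factor's β's is added to the general mean, so that every fitted value is left unchanged.

```python
def center_effects(mean, effects):
    """Re-express reference-level estimates as departures summing to zero.

    effects maps a factor name to its list of level estimates, in which
    one level (here the ninth) was arbitrarily set to zero.
    """
    centered = {}
    for factor, betas in effects.items():
        shift = sum(betas) / len(betas)  # sum of the betas divided by nine
        centered[factor] = [b - shift for b in betas]
        mean += shift  # assumed adjustment: keeps fitted values unchanged
    return mean, centered


# Hypothetical estimates in minutes; the ninth level of each factor is zero.
mean0 = 21.0
effects0 = {
    "vehicle": [-3.0, 1.5, 4.0, -5.5, 2.0, -6.0, 0.5, 7.5, 0.0],
    "driver": [1.0, -2.0, 3.5, 0.0, -1.5, 2.5, -0.5, 1.5, 0.0],
}

mean1, effects1 = center_effects(mean0, effects0)
for factor, betas in effects1.items():
    print(factor, [round(b, 2) for b in betas])
print("general mean:", round(mean1, 2))
```

After the transformation a large negative value for a vehicle or driver reads directly as a shorter traverse time than average.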

Additional  computations  were  performed  on  the  squares 
with  missing  data  to  obtain  the  sum  of  squares  for  vehicles 
and  drivers  for  an  ANOVA  table. 

The  items  of  main  interest  were  vehicle  effects  and  driver 
effects.  For  these  effects,  differences  of  the  means  were 
tested using Duncan's Multiple Range Test at the five percent
significance  level. 

RESULTS 


Primary analyses of the data were summarized in ANOVA
tables for the six squares. One of these tables is shown as
Table III. It is noted that only vehicles and drivers were
randomized. The courses are assumed to be a random sample from
a population of courses. Then a significance test for courses
is valid. Runs cannot be randomly assigned. Thus the sum of
squares for runs can be computed but a significance test for
runs is not valid. The magnitude of the ratio of mean square
for runs to remainder mean square was small for all six squares.
When tested at a significance level of five percent, the course,
vehicle, and driver effects were always significant at this
level except in two cases, and in these two cases the
significance level was less than 10 percent. Results of the analyses
are summarized in Table IV, which shows the F-ratios and their
respective significance levels.

Although two squares were run in each terrain, true
replication was not obtained because a different set of drivers
was used in each square. Under the assumption that the sets
of nine drivers per square were approximately equal, an analysis
of variance was made on the combined Squares 1 and 2 for
Terrains I and II. The squares of Terrain III were not combined.
Combining the squares made the tests more sensitive for
differences between vehicles.

A comparison of experienced versus novice drivers was
made by partitioning the sum of squares for drivers. The
F-ratios did not show a significant difference between experienced
and novice drivers for any square, nor for the combined squares,
at the five percent level.

Duncan's Multiple Range Test, applied to the means,
separated the vehicles into groups which were
significantly different from one another at the five percent
significance level. See Table V. Some vehicles fall between
two adjacent groups and cannot be considered different from
either group. Those vehicles are indicated by connecting lines
to the main groups in the table. For example, in the upper left
square, the group D, F, and A was the fastest, followed by
the group B, I, G, C, and then Vehicle H. Vehicle E1 can be
associated with either of the two groups indicated.

COMBINED ANALYSIS FOR TERRAINS

The vehicle mobility test was designed around the individual
Graeco-Latin square. It was not designed so the six
squares over the three terrains could be pooled in a
straightforward manner. Any analysis over the three terrains is further
complicated by the missing values in Terrain III, and grouping
of the drivers into experienced or novice drivers. Main items
of interest in combining the terrains are measures of the
relative performance of (1) vehicles over terrains and (2) drivers
over terrains.

(1)  Vehicles  Over  Terrains 


The  problem  of  missing  values  was  minimized  by 
using  only  data  on  vehicles  for  which  information  was  available 
for all six squares. A table of means of size 6 X 7 representing
six squares and seven vehicles was obtained by omitting data
for vehicles C, E1, and E2. See Table VI. The table is still
slightly  biased  because  of  the  inclusion  of  estimated  values 
for  missing  data  on  individual  runs.  The  bias  was  not  considered 
serious  because  an  estimated  value  in  general  was  incorporated 
into  an  average  of  eight  or  nine  numbers. 

An analysis of variance was performed on the table of means
for vehicles. These data are shown in Table VII. The F-ratios
confirm that terrain and vehicle effects were highly significant.
The main item of interest in this table, the vehicle X terrain
interaction effect, had an F-ratio of 0.93 and thus was not
significant.

With the understanding that squares are to be thought
of as replicates, for vehicles at least, the entries in the
ANOVA for [Sq 1] versus [Sq 2] and [Sq 1] minus [Sq 2] may be
used as some measure of "learning". This is a "pseudo-learning"
since a different set of drivers was always used in the second
square for each terrain. It does indicate, however, that drivers
were able to increase speeds by utilizing evidence of trails
from earlier runs. The [Sq 1] versus [Sq 2] mean square provides
an estimate of "learning" over the whole experiment (all three
terrains). The [Sq 1] minus [Sq 2] comparison provides an
estimate of the variation in this learning from Square 1 to
Square 2 within each terrain. In both cases the probability
of these F-ratios occurring by chance under H0 is less than
0.005. The significant "learning" effect appears to be
contradictory to the conclusion of no run effect within each
square. That is, if this "learning" effect is the result of
tracks or trails left from the previous vehicle, then the
"learning" effect should commence immediately after the first
run.


The run effect for the six squares was investigated
further as follows: First, the run totals were plotted with
the order of runs as the abscissa. A least squares linear
regression line was added. Although the points appeared
scattered, the slopes were negative for five of the six squares,
indicating less time to traverse a course as the run number
increased. But one square had a positive slope.

Analysis of variance tables were constructed to show the
reduction in sum of squares due to linear regression, with one
degree of freedom, and the deviation from linear regression,
with seven degrees of freedom for the 9 X 9 squares and six
degrees of freedom for the 8 X 8 square. If the order of runs
is a real effect, then the mean square ratios for reduction in
sum of squares due to linear regression should be large. A
tabulation of these ratios for the six squares, Terrain I,
Square 1 through Terrain III, Square 2, follows:


d.f.        RED. M.S./REM. M.S.
1, 47             6.16
1, 48             0.46
1, 48             1.10
1, 48             4.02
1, 37             0.34
1, 28             2.68

These ratios do not give strong support for the conclusion
that drivers used evidence of previous trails to reduce their
traverse time within a square. A possible explanation of the
pseudo-learning between squares is that during the runs of the
first square the drivers were able to comply with the requirement
of not using trails left from previous runs; however, compliance
with this requirement broke down during tests for the second square.
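
The partition of the run sum of squares described above can be sketched in Python. The run totals and the remainder mean square below are hypothetical values chosen only for illustration, and the function name is an assumption; the paper's own totals are not reproduced in this section. The run order 1, 2, ..., n is used as the regression variable.

```python
def partition_run_effect(run_totals, cells_per_run):
    """Split the run sum of squares into linear regression and deviations."""
    n_runs = len(run_totals)
    grand_total = sum(run_totals)
    ss_runs = (sum(t * t for t in run_totals) / cells_per_run
               - grand_total ** 2 / (n_runs * cells_per_run))
    x_bar = (n_runs + 1) / 2
    sxx = sum((k - x_bar) ** 2 for k in range(1, n_runs + 1))
    sxt = sum((k - x_bar) * t for k, t in enumerate(run_totals, start=1))
    # Each total is a sum over cells_per_run observations, hence the divisor.
    ss_linear = sxt ** 2 / (cells_per_run * sxx)
    return {
        "slope_of_totals": sxt / sxx,
        "ss_runs": ss_runs,
        "ss_linear": ss_linear,
        "df_linear": 1,
        "ss_deviation": ss_runs - ss_linear,
        "df_deviation": n_runs - 2,
    }


# Hypothetical run totals (minutes) for a 9 X 9 square with no empty cells.
totals = [212.0, 205.5, 209.0, 198.0, 201.5, 196.0, 199.5, 190.0, 193.5]
remainder_ms = 6.0  # hypothetical remainder mean square

parts = partition_run_effect(totals, cells_per_run=9)
ratio = parts["ss_linear"] / remainder_ms  # reduction M.S. / remainder M.S.
print(round(parts["slope_of_totals"], 3), round(ratio, 2))
```

A negative slope corresponds to shorter traverse times in later runs; the ratio would be referred to the F table with 1 and the remainder degrees of freedom.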

Performance  of  vehicles  over  the  three  terrains  was  shown 
graphically  by  first  subtracting  the  means  for  the  individual 
square  from  the  vehicle  means.  The  pel formance  for  the  vehicles 
were  then  algebraically  added  over  the  six  squares.  The 
departure  from  mean  time  for  each  vehicle  could  then  be 
plotted  as  shown  in  Figure  1.  Vehicle  F  shows  the  best  overall 
performance.  This  vehicle  is  the  1/4-ton  M151  A1  Utility  Truck. 
This  is  a  jeep  type  vehicle.  Vehicle  D  was  the  commercial 
Kaiser  jeep.  Vehicle  FI  was  the  slowest  vehicle.  This  was 
a  16-ton  payload  vehicle  with  an  unusual  control  system  and  no 
suspension  other  than  the  tires.  In  Figure  1  it  must  be 
remembered that Vehicle E1 was used only in Terrain I, Vehicle E2
was  used  only  in  Terrains  II  and  III,  and  Vehicle  C  was  not  used 
in  square  2  of  Terrain  III. 
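The summation described above can be sketched with the vehicle means of Table VI. This is an illustration under stated assumptions: two scan-damaged entries are read as 1535.0 and 1635.9, and each square mean is taken over the seven vehicles listed in Table VI only, so the numbers need not match Figure 1, which also covers the other vehicles.

```python
VEHICLES = ["A", "B", "D", "F", "G", "H", "I"]

# Mean traverse times in seconds, keyed by (terrain, square), from Table VI.
TABLE_VI = {
    ("I", 1): [932.9, 1140.9, 901.5, 928.6, 1160.2, 1067.2, 1149.4],
    ("I", 2): [843.7, 1070.9, 760.1, 766.8, 1025.6, 926.3, 1149.9],
    ("II", 1): [1270.0, 1327.2, 975.1, 1025.6, 1260.4, 1090.7, 1535.0],
    ("II", 2): [1003.2, 1077.1, 855.8, 857.0, 1061.3, 908.6, 1288.7],
    ("III", 1): [1816.7, 1635.9, 1578.9, 1372.4, 2013.5, 1416.3, 2416.2],
    ("III", 2): [1383.1, 1384.2, 1043.0, 954.5, 1394.1, 917.2, 1060.0],
}


def summed_departures(table, vehicles):
    """Subtract each square's mean from the vehicle means, then add the
    departures algebraically over all squares."""
    totals = dict.fromkeys(vehicles, 0.0)
    for means in table.values():
        square_mean = sum(means) / len(means)
        for vehicle, value in zip(vehicles, means):
            totals[vehicle] += value - square_mean
    return totals


departures = summed_departures(TABLE_VI, VEHICLES)
best = min(departures, key=departures.get)   # most negative departure = fastest
```

Under these assumptions the most negative summed departure belongs to Vehicle F, in agreement with the statement that Vehicle F shows the best overall performance.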
(2) Drivers Over Terrains

There  was  no  replication  of  drivers  within  terrains; 
hence,  no  direct  test  for  driver  X  terrain  interaction  can  be 
obtained  if  an  ANOVA  is  performed  on  a  table  of  means  for 
drivers.  An  ANOVA  operation  was  computed  and  the  mean  square 
term for driver X terrain interaction was 77,915. The
magnitude of this term is not large in relation to other relevant
mean  squares  obtained  from  the  data.  The  remainder  sum  of 
squares  term  in  the  previous  ANOVA  table  for  vehicles  over 
terrains  was  22,325.  A  denominator  of  this  magnitude  for 
determination  of  the  F-ratio  would  indicate  that  the  driver  X 
terrain  interaction  effect  is  significant  at  about  the  0.025 
level.  The  remainder  sum  of  squares  for  the  six  basic  squares, 
however, ranged from 35,933 to 283,164 and it is concluded
that  the  driver-terrain  interaction  effect  cannot  be  properly 
assessed. 

Data  for  drivers  were  summarized  in  the  same  manner  as 
for  vehicles.  Differences  between  the  fastest  and  slowest 
drivers within a square ranged from 5.5 to 11.2 minutes, except
in Terrain III, Square 1, where the maximum difference was 25.8
minutes. The mean time for all drivers in this square was
30.2  minutes.  The  major  portion  of  the  difference  can  be 
attributed  to  driver  Number  3.  Overall  performance  of  this 
driver  was  poor,  as  shown  graphically  in  Figure  2. 

The  total  difference  in  elapsed  time,  over  all  terrains, 
between  experienced  and  novice  drivers  was  49.9  seconds,  or 
less  than  one  minute.  This  difference  was  not  significant  at 
the  five  percent  significance  level,  nor  even  at  the  20  percent 
significance  level.  It  can  be  safely  concluded  that  although 
differences  exist  among  drivers,  the  differences  are  between 
individual  drivers  and  not  the  subclassification  of  experienced 
and  novice  as  defined  for  this  experiment. 

BIAS  IN  TREATMENT  SUM  OF  SQUARES 

The percent upward bias of the treatment sum of squares
was calculated for the three squares having missing values.
Results are shown in Table VIII. The percent bias was determined
from the ratio of the sum of squares determined by supplying
estimated values to the unbiased sum of squares as determined by
the regression analysis. It can be seen that the bias for vehicles
for Terrain III, Square 1, with eleven missing values, was over
52 percent. Even so, all conclusions for vehicles were the
same, as both F-ratios were significant at less than the 0.01 level.
However, the 21.4 percent bias did modify the conclusions for
drivers in Terrain III, Square 2. (The F-ratio obtained using
the unbiased sum of squares was not significant at the five
percent level, whereas the F-ratio had been significant when
computed from the biased sum of squares.)

CONCLUSIONS 

1.  The  Graeco-Latin  square  design  for  this  experiment  did 
allow  the  separation  of  various  factors  so  that  relative  effects 
of  each  could  be  estimated. 

2.  The  linear  run  effect  within  squares  was  not  consistent 
throughout  the  experiment.  However,  there  was  some  evidence 
that  the  order  of  testing  may  sometimes  be  significant. 

3.  The  course  effect  was  large  throughout  the  experiment 
and  the  largest  overall  contributor  to  the  sum  of  squares.  This 
means  that  the  courses  within  a  terrain  were  not  homogeneous 
with  respect  to  time  required  for  a  vehicle  to  traverse  the 
courses.

4.  Vehicle  effect  was  significant.  It  was  possible  to 
assess  relative  vehicle  performance  by  separating  vehicles  of 
similar  performance  into  different  groups. 

5.  Driver  effect  was  significant.  Individual  drivers 
could  be  separated  into  groups  of  similar  performance.  However, 
there was no significant difference between the subclassifications
of  experienced  and  novice  drivers  as  defined  for  this  experiment. 

6.  Interaction  between  vehicles  and  terrain  was  not 
significant. 

7.  Interaction  between  drivers  and  terrain  could  not  be 
properly  assessed. 

8.  A  pseudo  "learning"  effect  between  the  two  squares 
within  a  terrain  was  highly  significant.  The  cause  of  this  effect 
was  not  accurately  described. 

SUMMARY  AND  RECOMMENDATIONS 

The number of possible combinations of the four factors
with nine levels is nine to the fourth power or 6561. Since
only 81 observations were taken by using the 9X9 Graeco-Latin
square, the actual test is equivalent to a 1/81 replicate.
Results of these tests and the information obtained were
considered very satisfactory. This type of test appears useful
for other tests involving mobility of vehicles. Specific points
for consideration are as follows:

1. There was some evidence, although not conclusive,
that trails left by previous runs influenced subsequent runs.
It is also reasonable to expect that variations in the weather
and other environmental conditions would affect the outcome of
a test run. It is therefore recommended that the order of
testing (runs) be built into the design for future tests of this
nature.

2.  The  design  must  allow  for  analysis  of  the  effects  of 
differences  in  courses  and  differences  in  drivers. 

3. The referee effect was not measured during these tests.
Ancillary information picked up during these tests indicates that
the referee effect may be significant. In a future experiment
of this type it may be appropriate to superimpose an additional
orthogonal square onto the two orthogonal squares of the
Graeco-Latin design to assess the referee effect, i.e., add another
language to the design.

4. Since there was no significant vehicle-terrain interaction
effect, the size of most future experiments could be reduced
by limiting tests to one terrain. As an alternative, courses
may be laid out over a varying type of terrain.

5.  This  general  type  of  statistically  designed  vehicular 
mobility  test  may  be  extended  to  determine  differences  among 
features  of  vehicles.  Examples: 

(a) Different power plants, transmissions, or other
components in the same vehicle.

(b) Effects of payload.

(c)  Tracked  versus  wheeled  vehicles  over  a  particular 
type  of  terrain. 
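The fractional-replicate arithmetic and the recommendation to superimpose a further orthogonal square can be illustrated by a short sketch. It builds three mutually orthogonal 9X9 Latin squares from arithmetic in the finite field of nine elements, which is one standard construction and is not claimed to be the layout used in these tests. Assigning the squares to vehicles, drivers, and referees is hypothetical.

```python
from fractions import Fraction

# GF(9) modelled as pairs (a, b) standing for a + b*i, with i*i = -1
# and both coefficients reduced mod 3 (x^2 + 1 has no root mod 3).
FIELD = [(a, b) for a in range(3) for b in range(3)]
INDEX = {element: k for k, element in enumerate(FIELD)}


def add(x, y):
    return ((x[0] + y[0]) % 3, (x[1] + y[1]) % 3)


def mul(x, y):
    a, b = x
    c, d = y
    return ((a * c - b * d) % 3, (a * d + b * c) % 3)


def latin_square(m):
    """Square whose (row, column) entry is m*row + column in GF(9), m nonzero."""
    return [[INDEX[add(mul(m, r), c)] for c in FIELD] for r in FIELD]


def is_latin(square):
    full = set(range(9))
    return all(set(row) == full for row in square) and all(
        set(col) == full for col in zip(*square)
    )


def orthogonal(sq1, sq2):
    """True when every ordered pair of symbols occurs exactly once."""
    pairs = {(sq1[r][c], sq2[r][c]) for r in range(9) for c in range(9)}
    return len(pairs) == 81


vehicles = latin_square((1, 0))   # "Latin" letters
drivers = latin_square((2, 0))    # "Greek" letters
referees = latin_square((0, 1))   # hypothetical third "language"

full_factorial = 9 ** 4                    # four factors at nine levels
fraction = Fraction(81, full_factorial)    # 81 runs out of 6561 combinations
```

Distinct nonzero multipliers always give orthogonal squares in this construction, so up to eight such squares exist for order nine and a referee factor could be added without increasing the 81 runs.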
Letters = Vehicles
Numerals = Drivers

Numerical Data = Time to traverse course, in seconds.

VEHICULAR  MOBILITY  TEST  DATA 
TERRAIN  III,  SQUARE  ?. 

TABLE  I 
                    Minutes          Average Vehicle     Average Miles Driven
                    Per Run          Speed, MPH          Per Course

Terrain I                                                 2.5 - 2.7
   Square 1        10.0 - 46.9       5.7 - 12.7
   Square 2         8.0 - 35.9       5.8 - 14.9

Terrain II                                                1.5 - 2.5
   Square 1         9.2 - 46.6       4.8 - 7.1
   Square 2         7.7 - 34.4       5.5 - 7.8

Terrain III                                               0.6 - 1.7
   Square 1         7.1 - 74.1       2.3 - 4.1
   Square 2         5.1 - 52.8       2.7 - 5.2


VEHICLE SPEED AND COURSE DATA

TABLE II
                   I-1*     I-2      II-1     II-2     III-1    III-2

F-RATIOS

   Runs **         1.70     0.65     1.81     0.94     0.84     1.04
   Courses         4.90     2.56     4.38    14.26    11.39    10.66
   Vehicles       16.68    18.25     1.99     5.13     3.38     2.40
   Drivers         3.20     2.78     3.22    10.35     3.91     2.07

SIGNIFICANCE LEVELS

   Runs **          -        -        -        -        -        -
   Courses         .001     .025     .001     .001     .001     .001
   Vehicles        .001     .001     .100     .001     .005     .050
   Drivers         .010     .025     .010     .001     .005     .100

*  Example: I-1 = Terrain I, Square 1

** Significance Test Not Valid


F-Ratios and Their Respective Significance Levels


TABLE IV
[Table body not recoverable from the scan. It grouped the vehicles
(A, B, C, D, E1, E2, F, G, H, I) into distinguishable groups for
Terrains I, II, and III under the columns Square 1, Square 2, and
Combined. Legible entries include "No Significant Differences" and
"Not Combined".]

DISTINGUISHABLE VEHICULAR GROUPS
AT 5 PERCENT SIGNIFICANCE LEVEL


TABLE V
                  A         B         D         F         G         H         I

TER #1   SQ 1    932.9    1140.9     901.5     928.6    1160.2    1067.2    1149.4
         SQ 2    843.7    1070.9     760.1     766.8    1025.6     926.3    1149.9

TER #2   SQ 1   1270.0    1327.2     975.1    1025.6    1260.4    1090.7    1535.0
         SQ 2   1003.2    1077.1     855.8     857.0    1061.3     908.6    1288.7

TER #3   SQ 1   1816.7    1635.9    1578.9    1372.4    2013.5    1416.3    2416.2
         SQ 2   1383.1    1384.2    1043.0     954.5    1394.1     917.2    1060.0

TABLE VI: TABLE OF MEANS FOR VEHICLES OVER TERRAINS
SOURCE                          d.f.         M.S.        F-Ratio

Squares                         (6)
   Mean                           1      58,919,874
   Terrain                        2         827,562       37.07
   [Sq 1] vs [Sq 2]*              1         940,056       42.11
   [Sq 1] minus [Sq 2]**          2         227,064       10.17
Vehicles                          6         172,334        7.72
Vehicles X Terrain               12          20,788        0.93
Remainder ***                   (18)         22,325
   Vehicle X Square               6          18,641
   Veh X Ter X Sq                12          24,167

Total                            42

*   Over three terrains
**  Within Terrains
*** For estimate of experimental error


ANOVA For Vehicles Over Terrains


TABLE VII
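Several entries of Table VII can be recomputed from the means in Table VI, which gives a useful check on both tables. The sketch below is an illustration, not the authors' program: it reads two scan-damaged Table VI entries as 1535.0 and 1635.9 (an assumption), takes the remainder mean square of 22,325 directly from Table VII, and recomputes only the Mean, Terrain, and square-contrast lines.

```python
from collections import defaultdict

# (terrain, square, means for vehicles A, B, D, F, G, H, I), from Table VI.
ROWS = [
    (1, 1, [932.9, 1140.9, 901.5, 928.6, 1160.2, 1067.2, 1149.4]),
    (1, 2, [843.7, 1070.9, 760.1, 766.8, 1025.6, 926.3, 1149.9]),
    (2, 1, [1270.0, 1327.2, 975.1, 1025.6, 1260.4, 1090.7, 1535.0]),
    (2, 2, [1003.2, 1077.1, 855.8, 857.0, 1061.3, 908.6, 1288.7]),
    (3, 1, [1816.7, 1635.9, 1578.9, 1372.4, 2013.5, 1416.3, 2416.2]),
    (3, 2, [1383.1, 1384.2, 1043.0, 954.5, 1394.1, 917.2, 1060.0]),
]
REMAINDER_MS = 22325.0  # taken from Table VII, not recomputed here


def anova_terms(rows):
    """Mean squares for the Mean, Terrain, and square-contrast lines."""
    n_cells = sum(len(values) for _, _, values in rows)           # 42
    grand_total = sum(sum(values) for _, _, values in rows)
    mean_ss = grand_total ** 2 / n_cells                          # correction term

    terrain_total = defaultdict(float)
    square_total = defaultdict(float)    # Square 1 and Square 2 pooled over terrains
    cell_total = {}
    for terrain, square, values in rows:
        terrain_total[terrain] += sum(values)
        square_total[square] += sum(values)
        cell_total[(terrain, square)] = sum(values)

    per_cell = len(rows[0][2])           # 7 vehicle means per square
    per_terrain = 2 * per_cell           # two squares per terrain
    terrain_ss = sum(t * t for t in terrain_total.values()) / per_terrain - mean_ss
    contrast_ss = (square_total[1] - square_total[2]) ** 2 / n_cells
    squares_ss = sum(t * t for t in cell_total.values()) / per_cell - mean_ss
    within_ss = squares_ss - terrain_ss - contrast_ss
    return {
        "Mean": mean_ss,                   # 1 d.f.
        "Terrain": terrain_ss / 2,         # 2 d.f.
        "Sq1 vs Sq2": contrast_ss,         # 1 d.f.
        "Sq1 minus Sq2": within_ss / 2,    # 2 d.f., within terrains
    }


ms = anova_terms(ROWS)
f_ratios = {
    name: round(value / REMAINDER_MS, 2)
    for name, value in ms.items()
    if name != "Mean"
}
```

Under these assumptions the recomputed mean squares and F-ratios for these lines agree with the tabulated values to within rounding.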
                  Terrain I            Terrain III           Terrain III
                  Square 1             Square 1              Square 2
                  (1 Missing Value)    (11 Missing Values)   (Missing Values)

Runs ♦
Courses              2.4                 12.0                   2.0
Vehicles             0.8                 52.2                   9.1
Drivers              2.2                 47.0                  21.4


UPWARD BIAS IN TREATMENT SUM OF SQUARES
WHEN MISSING VALUES WERE ESTIMATED


TABLE VIII


PERFORMANCE  OF  DRIVERS 
OVER  THREE  TERRAINS 


Figure  2 


PROBABILITY OF A NON-REPEATABLE OBSERVATION - AN
EXAMINATION OF THE UTILITY CONCEPT AND THE
NATURE OF QUEUEING SEQUENCES


Mikiso Mizuki

Federal  Electric  Corporation/ITT 
Vandenberg  AF  Base,  California 


0. INTRODUCTION. Subjective probability is often defined
using the utility concept of gambles and lotteries, cf. de Finetti
[2] and Savage [8]. Such an approach gives the only tangible means
of measuring the personal assessment of subjective probabilities.
However, the basis for this approach seems to be the unstated premise
that the gambles are to be played or can be played repeatedly. The
expected utility, or the weighted mean of gains weighted by the
probabilities of particular outcomes, has a clearly defined meaning
under such conditions. By contrast, the same weighted mean does
not possess any practical meaning for a non-repeatable observation.
Fishburn [3] concedes that in order to define subjective probability
coherently using the utility concept, it is essential to have
consequences that can occur under more than one state. This indicates
the possibility of modifying the utility theory for non-repetitive
random events.

As the second topic of this paper, the nature of queueing sequences
is investigated from the same point of view. The queueing sequences
constitute non-repeatable observations for each particular service
system. An observable queue size sequence is dependent on its companion
sequence of arrival/service events. By the above argument, the
probability discussed in queueing models of a particular system cannot be
interpreted as subjective probability. An investigation of the
characteristics of ensembles of queueing sequences is made.

1. UTILITY THEORY AND SUBJECTIVE PROBABILITY. The utility theory
is constructed using a mixture space, for instance as defined in [3]. A
mixture set consists of a set $R = \{A, B, C, \ldots\}$ and an operation
$\alpha A + (1-\alpha)B$ which associates an element of $R$ with each
$\alpha \in [0,1]$ and each ordered pair $(A,B) \in R^2$ such that, if
$A, B \in R$ and $\alpha, \beta \in [0,1]$, then

$$1A + 0B = A, \tag{1.1}$$

$$\alpha A + (1-\alpha)B = (1-\alpha)B + \alpha A, \tag{1.2}$$

$$\alpha[\beta A + (1-\beta)B] + (1-\alpha)B = \alpha\beta A + (1-\alpha\beta)B. \tag{1.3}$$


In repeatedly played gambles, the expression $\alpha A + (1-\alpha)B$
corresponds to the gain (or loss) of mixed outcomes of A's in
$100\alpha\%$ of plays and of B's in $100(1-\alpha)\%$ of plays. In
particular, if the gain A is set equal to unity and the gain B is set to
zero, the utility of the mixture of A's and B's, namely
$\alpha \cdot 1 + (1-\alpha) \cdot 0 = \alpha$, represents the
subjective probability that A occurs. The generalization of this gambling
situation to non-repetitive random events requires the substitution of
the uncertainty of a single random outcome by an aggregate of random
observations.

Some of the difficulties are typified by the examples of non-constant
valued consequences. For instance, the utility in the sense of social
justice of a judge's sentence varies depending on his choice of act of
taking the side that the accused did or did not commit the crime;
Fishburn, loc. cit. In the risk taking acts of Russian roulette and
dangerous mountain climbing, the mental elation after the acts, if one
survives, gives a different value to being alive from that of not taking
the chances. Under such conditions, the linear combinations of utilities
of consequences do not have any meaning. This is the basis on which
Fishburn made the statement that subjective probability cannot be
discussed for such cases.

The probability assigned to a non-repeatable observation is best
formulated as a set of real numbers distributed over an exhaustive set
of mutually exclusive possible outcomes. Denote the possible outcomes
by $A_i$, $i = 1,\ldots,n$, and the real numbers assigned to $A_i$ by
$P(A_i)$, satisfying $P(A_i) \ge 0$ and $\sum_i P(A_i) = 1$. Suppose a
gain of $A_i$ is made when $A_i$ is observed, where all the gains may be
bounded. Then, if $A_i$ is observed, no other $A_j$'s ($j \ne i$) can add
to the gain after observing $A_i$. Because of this, there exists no
logical basis for associating a gain of $A_i$ with those of $A_j$'s in
the form of the expected utility, $\sum_i (\text{gain of } A_i)\,P(A_i)$.

2. EXPECTATION AND EXPECTED UTILITY. Define a variable which
takes on $x_i$ when $A_i$ is observed, and define the indicator function

$$I_{A_i} = \begin{cases} 1 & \text{if the observed outcome is } A_i, \\ 0 & \text{otherwise.} \end{cases}$$

Then, the simple random variable $X$ is given by, cf. Loève [4],

$$X = \sum_i x_i I_{A_i}. \tag{3}$$

The expectation of $X$ is given by

$$E(X) = \sum_i x_i P(A_i). \tag{4}$$

Suppose the utilities of the constant consequences are represented by
$x_i$ when the state $A_i$ obtains. Then, (4) is the expected utility of
the outcomes. However, as mentioned earlier, the same expression is not
adequate for the representation of the utility of a non-repeatable and
non-constant valued consequence. In order to circumvent this difficulty,
Mizuki [6] suggested an alternative definition of expectation for a
non-repeatable observation of the form

$$E_{NR}(X) = x_i P(A_i), \tag{5}$$

$i = 1,\ldots,n$, yielding $n$ different expected values, one for each
possible outcome $x_i$. The $E_{NR}$ expectation introduced here is
consistent with Bayes' definition of probability [1] of any event to be
the ratio between the value at which an expectation depending on the
happening of the event ought to be computed and the value of the thing
expected upon its happening. The use of (5) leads to an interesting
modification of the utility theory for a non-repeatable event.


The above formulation is slightly generalized. Suppose there exist
a chosen act, denoted by $H$, and $n$ mutually exclusive states, $A_j$,
$j = 1,\ldots,n$, and consequences, $u_j^H$, measured in some utility,
when $H$ is chosen and $A_j$ obtains. The probability that $A_j$ obtains
when $H$ is chosen is defined by real numbers $P_H(A_j)$, satisfying
$P_H(A_j) \ge 0$ and $\sum_j P_H(A_j) = 1$. In order to account for the
non-constant values of consequences, $u_j^H$ is not necessarily equal to
$u_j^{H'}$ for $H' \ne H$. The familiar use of mixed acts is not justified
for a non-repeatable situation and will be excluded from the subsequent
development. The connotation is that in spite of the mixing operations
prior to the final choice of act, the chosen act is unique, thus losing
all of its random attributes unlike the case of repeatable events. This
eliminates the necessity of defining the probability $P(H)$ assigned over
different choices of $H$'s. Under this set of conditions, a simple random
variable of utility $U_H$ of a chosen act $H$ is defined by

$$U_H = \sum_{j=1}^{n} u_j^H I_{A_j}, \tag{6}$$

and its expectation by

$$E_{NR}(U_H) = u_j^H P_H(A_j), \qquad j = 1,\ldots,n. \tag{7}$$

For a choice problem of a non-repeatable event, the expectation given
in (7) can be used as the optimizing criterion.

3. A CRITERION FOR PREFERENCE. The individual expectation
$u_j^H P_H(A_j)$ may be interpreted as the psychological incentive
force acting on a lever at the point of distance $u_j^H$ from the fulcrum
with the mass $P_H(A_j)$, whereas the incentive force should be measured
at a fixed point on the lever always. Since there exists only one
outcome event $A_j$, the incentive forces can act only individually, but
not collectively, for any given decision problem. Application of such
an interpretation is considered below.

Savage  discusses  an  example  on  the  choice  between  two  pairs  of 
gambles,  pp.  101-103  of  [8].  Savage  prefers  Gamble  1  to  2,  and  Gamble 
3  to  4  after  reversing  his  initial  intuitive  choice  of  Gamble  4  over  3 
by  applying  the  sure-thing  principle.  However,  the  utility  theory  being 
developed  simply  as  a  normative  theory,  it  is  natural  to  seek  an  augmented 
normative theory which explains his initial intuitive choice. The
specifications  of  Savage's  gambles  are  as  follows.  For  the  choice 
between  Gambles  1  and  2, 

Gamble  1:  $500,000  with  probability  1;  and 

Gamble  2:  $2,500,000  with  probability  0.1, 

$500,000  with  probability  0.89, 

$0  with  probability  0.01. 

Similarly,  for  Gambles  3  and  4, 

Gamble  3:  $500,000  with  probability  0.11, 

$0  with  probability  0.89;  and, 

Gamble  4:  $2,500,000  with  probability  0.1, 

$0  with  probability  0.9. 


For the sake of simplicity, suppose one acts based on a linear utility
function over the range of zero to $2,500,000 of the form u(x) = kx,
k > 0 for x dollars gain. Using (7), it is immediately seen that the
expected utility term of $500,000 of Gamble 1 is greater than any of
the three expected utility terms of $250,000, $445,000 and $0 of Gamble 2.
Likewise, the combination of the expected utility terms of $250,000 and $0
of Gamble 4 is more attractive than the $55,000 and $0 combination of
Gamble 3. A similar preference pattern is obtained even in the general
case of usual concave utility functions. When the utility function
becomes sharply convex, an individual inclines to prefer Gamble 2 to 1,
and at the same time, he continues to prefer Gamble 4 to 3. This is a
clear indication that the leverage system model can explain the general
intuitive choice patterns.
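The individual terms quoted above can be checked with a short sketch for the linear utility u(x) = kx with k = 1. Exact fractions avoid rounding in products such as 0.1 times 2,500,000. The function names are illustrative, and comparing gambles by their largest single term is one simple reading of "greater than any of the terms", adopted here as an assumption rather than as the paper's formal rule.

```python
from fractions import Fraction

# Savage's gambles as lists of (gain in dollars, probability).
GAMBLES = {
    1: [(500_000, Fraction(1))],
    2: [(2_500_000, Fraction(1, 10)), (500_000, Fraction(89, 100)), (0, Fraction(1, 100))],
    3: [(500_000, Fraction(11, 100)), (0, Fraction(89, 100))],
    4: [(2_500_000, Fraction(1, 10)), (0, Fraction(9, 10))],
}


def utility(x, k=1):
    """Linear utility u(x) = k*x with k > 0."""
    return k * x


def enr_terms(gamble):
    """Individual terms u_j * P(A_j) of (7), kept separate rather than summed."""
    return [utility(x) * p for x, p in gamble]


def expected_utility(gamble):
    """Classical expected utility of (4): the sum of the individual terms."""
    return sum(enr_terms(gamble))


def preferred_by_largest_term(g, h):
    """Assumed reading: g is preferred when its largest term exceeds every term of h."""
    return max(enr_terms(g)) > max(enr_terms(h))


terms = {name: enr_terms(g) for name, g in GAMBLES.items()}
```

With these definitions the term-by-term comparison favors Gamble 1 over 2 and Gamble 4 over 3, whereas the summed expected utility under the same linear utility ranks Gamble 2 above Gamble 1.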

The preference rule examined above can be summarized by:

Dominance of Expectations: Act $H$ is preferred to act $G$, if

$$u_j^H P_H(A_j) \ge u_j^G P_G(A_j)$$

for corresponding $A_j$'s.

This is a partitioned version of the familiar Bayes' principle which
maximizes the expected utility (or utilities in this case). The other
familiar rules of dominance principle, minimax regret, and maxmin
principles remain unchanged for such non-repeatable events. For the
details of this development, the readers are referred to [7].

4. ABOUT QUEUEING SEQUENCES. A queueing model is specified by
the input process, service time distribution, and the number of servers.
The most elementary example is that of Poisson arrivals (M) and negative
exponential distribution (M) of service times with a single server (1),
or M/M/1 system, which will be examined in the following.

For a particular system, a pair of sequences of customers' arrival
and departure times, or equivalently a pair of sequences of queue sizes
and arrival/service events, can be observed. In the latter pair, the
queue size sequence is functionally dependent on the observed sequence of
arrival/service events. These sequences are random in nature prior to the
observation, but are unique and fixed once observed. In other
words, these sequences constitute a pair of non-repeatable observations
from an ensemble of such pairs. A subjective probability may be used
to describe the uncertainties of such samplings. However, there exists
a complete analogy with the utility of non-constant valued consequences
of a non-repeatable event of Section 2. If we use Fishburn's example
of a judge's sentence, the arrival/service events sequence corresponds
to the judge's taking the side that the accused did or did not commit
the crime, and the queue size sequence corresponds to the social justice.
This puts the problem right back to the start.

The M/M/1 models are often analyzed using the birth-and-death process
models. Consider a simple birth-and-death process of Poisson input with
a constant parameter $\lambda$ and a negative exponential service time
with a constant parameter $\mu$. By denoting the probability that the
queue size is $n$ at time $t$ by $P_n(t)$, the standard differential
difference equations are introduced, i.e.,

$$\begin{aligned}
P_n'(t) &= -(\lambda + \mu)P_n(t) + \lambda P_{n-1}(t) + \mu P_{n+1}(t) \quad \text{for } n > 0, \\
P_0'(t) &= -\lambda P_0(t) + \mu P_1(t).
\end{aligned} \tag{8}$$

The original balancing equation is given by

$$P_n(t+\Delta t) = (1 - \lambda\Delta t - \mu\Delta t)P_n(t) + \lambda\Delta t\,P_{n-1}(t) + \mu\Delta t\,P_{n+1}(t) \tag{9}$$

for $n > 0$. Notice that there exist two classes of probabilities in (9),
namely, one class of $\lambda\Delta t$, $\mu\Delta t$, and
$(1 - \lambda\Delta t - \mu\Delta t)$, and the other of $P_n(t)$.

The former designates the probability of arrival/service events, and the
latter designates the probability of queue sizes. The queue size of a
particular M/M/1 system is, however, by definition a step function in
time. If the queue size at time $t$, denoted by $q(t)$, is known, for
suitably small $\Delta t$,

$$q(t+\Delta t) = \begin{cases}
q(t) & \text{with probability } 1 - \lambda\Delta t - \mu\Delta t, \\
q(t) - 1 & \text{with probability } \mu\Delta t, \\
q(t) + 1 & \text{with probability } \lambda\Delta t.
\end{cases} \tag{10}$$


In fact, q(t) may not be known unless it is observed, but q(t) is not a
random variable. Rather, q(t) is an observation which is a constant; and,
furthermore, q(t) cannot be observed repeatedly for any given t. Thus,
q(t) is a single non-repeatable observation. Equation (10) defines that
q(t) is a function dependent on another non-repeatable observation over
Δt of a new arrival, a departure, or no events.
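One realization of the step rule (10) can be generated as in the sketch below. It is a discrete-time illustration under assumptions not stated in the text: the function name and parameter values are hypothetical, the increment Δt is taken small enough that λΔt + μΔt is at most one, and a departure from an empty queue is treated as leaving the queue empty.

```python
import random


def simulate_queue(lam, mu, dt, steps, q0=0, seed=0):
    """Generate one sample path q(0), q(dt), ..., q(steps*dt) from rule (10)."""
    if (lam + mu) * dt > 1:
        raise ValueError("dt too large for the three-outcome step rule")
    rng = random.Random(seed)        # fixed seed: the path, once drawn, is unique
    q = q0
    path = [q]
    for _ in range(steps):
        u = rng.random()
        if u < lam * dt:             # arrival, probability lam*dt
            q += 1
        elif u < (lam + mu) * dt:    # departure, probability mu*dt
            q = max(q - 1, 0)        # assumption: an empty queue stays empty
        # otherwise no event, probability 1 - lam*dt - mu*dt
        path.append(q)
    return path


path = simulate_queue(lam=0.5, mu=0.8, dt=0.1, steps=200, q0=0, seed=1)
```

Each call with a given seed yields a single fixed step function, which mirrors the point that an observed q(t) is a constant and is determined by its companion sequence of arrival/service events.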


In the original formulation of the birth-and-death process models,
$P_n(t)$ is defined as the proportion of $n$ items in existence at time
$t$ with respect to a set of simultaneously observable ensembles, such as
bacterial cultures, and particles in chambers. Our primary interest in the
behavior of a particular queueing system differs from these cases, and
$P_n(t)$ is a representation of the uncertainty for the value $q(t)$
prior to its non-repeatable observation. Since $q(t)$ is known to be
unique at $t$, it is sensible to construct a parametric model shown below:

Define $Q(N)$, $Q(A)$, and $Q(L)$ to be three matrices satisfying

$$\begin{aligned}
Q(N) &= (\delta_{i,j}), && i, j = 0,1,2,\ldots \\
Q(A) &= (\delta_{i+1,j}), && i = 0,1,2,\ldots;\ j = 1,2,\ldots \\
Q(L) &= \begin{cases} (\delta_{0,j}), & j = 0,1,2,\ldots \\ (\delta_{i-1,j}), & i = 1,2,3,\ldots;\ j = 0,1,2,\ldots \end{cases}
\end{aligned} \tag{11}$$


At time $t$ the queue size of a particular M/M/1 system is given by a
vector $q(t) = (q_0(t), q_1(t), q_2(t), \ldots)$, where $q_i(t) = 1$ for
some $i$ and $q_j(t) = 0$ for $j \ne i$. The queue size at $t + \Delta t$
is then given by $q(t)Q(x)$, $x = N, A, L$, which will occur with the
probability $\Pi(x)$ respectively such that

$$\Pi(N) = 1 - (\lambda + \mu)\Delta t, \qquad \Pi(A) = \lambda\Delta t, \qquad \Pi(L) = \mu\Delta t. \tag{12}$$

In this formulation $Q(x)$ is a random matrix which takes on $Q(N)$,
$Q(A)$, or $Q(L)$ with the probability of $\Pi(N)$, $\Pi(A)$, or $\Pi(L)$.

The two different notions of expectations of (4) and (5) can be
applied to the above argument. Let $x_i$ and $P(A_i)$ correspond to
$Q(x)$ and $\Pi(x)$, respectively. Then, we can define a simple random
matrix

$$Q = \sum_x Q(x) I_x, \tag{13}$$

where $x = N, A, L$. Then, from (4) we obtain

$$E(Q) = \sum_x Q(x)\Pi(x) = (1 - \lambda\Delta t - \mu\Delta t)Q(N) + \lambda\Delta t\,Q(A) + \mu\Delta t\,Q(L). \tag{14}$$

Consider some arbitrary ensemble of $q(t)$'s, and define the expectation
$E(q(t)) = p(t)$ over this ensemble to be a probability vector such that
$p(t) = (p_i(t))$, $i = 0,1,2,\ldots$, $0 \le p_i(t) \le 1$,
$\sum_i p_i(t) = 1$. Define the entry of $p(t)Q(N)$ for queue size $n$ to
be $P_n(t)$, of $p(t)Q(A)$ to be $P_{n-1}(t)$, and of $p(t)Q(L)$ to be
$P_{n+1}(t)$. Then, the entry of $E(p(t)Q(x))$ for queue size $n$ is
given by Equation (9) of the birth-and-death process model.
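The correspondence between the expected matrix of (14) and the balancing equation (9) can be checked numerically on a truncated state space. The sketch below is an illustration only: the truncation to a finite number of states, the handling of the top state, and all names and parameter values are assumptions introduced for the example.

```python
def shift_matrices(size):
    """Truncated versions of Q(N), Q(A), Q(L) from (11) on states 0..size-1."""
    states = range(size)
    q_n = [[1.0 if j == i else 0.0 for j in states] for i in states]
    # Arrival moves state i to i+1; the top state is held fixed (truncation assumption).
    q_a = [[1.0 if j == min(i + 1, size - 1) else 0.0 for j in states] for i in states]
    # Departure moves state i to i-1; state 0 stays at 0, as in the first row of Q(L).
    q_l = [[1.0 if j == max(i - 1, 0) else 0.0 for j in states] for i in states]
    return q_n, q_a, q_l


def vec_mat(p, q):
    """Row vector times matrix."""
    n = len(p)
    return [sum(p[i] * q[i][j] for i in range(n)) for j in range(n)]


def expected_step(p, lam, mu, dt):
    """p(t) multiplied by E(Q) of (14), with the weights of (12)."""
    weights = (1 - (lam + mu) * dt, lam * dt, mu * dt)
    parts = [vec_mat(p, q) for q in shift_matrices(len(p))]
    return [sum(w * part[j] for w, part in zip(weights, parts)) for j in range(len(p))]


def balance_equation(p, n, lam, mu, dt):
    """Right-hand side of (9) for an interior queue size n."""
    return (1 - lam * dt - mu * dt) * p[n] + lam * dt * p[n - 1] + mu * dt * p[n + 1]


p_t = [0.40, 0.25, 0.15, 0.10, 0.06, 0.04]        # hypothetical p(t)
p_next = expected_step(p_t, lam=0.5, mu=0.8, dt=0.1)
```

For every interior state the entry of `p_next` equals the value given by `balance_equation`; only the two boundary states depend on the truncation choices.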
On the other hand, the use of (5) obtains

$$E_{NR}(Q) = \sum_x Q(x)\Pi(x) I_x = \begin{cases}
(1 - \lambda\Delta t - \mu\Delta t)\,Q(N), & \text{if } x = N, \\
\lambda\Delta t\,Q(A), & \text{if } x = A, \\
\mu\Delta t\,Q(L), & \text{if } x = L.
\end{cases} \tag{15}$$

This definition of $E_{NR}(Q)$ satisfies a one-to-one correspondence with
(10) except for the fact that $\Pi(x)Q(x)$ is given instead of $Q(x)$
with its associated $\Pi(x)$.


Another queueing model of Poisson arrivals (M) and general service
time distribution (G) of a single server (1), or M/G/1 system, can also
be analyzed using the approach of queueing sequences. It can be shown
that the convergence properties defined for the overall ensemble of
queueing sequences do not hold for the conditional subensemble of M/G/1
sequences, cf. [5].
ABSTRACT. A method is presented for calculating the probability
of  killing  a  multiple  target  aircraft  formation  attacking  a  missile 
battery  as  a  function  of  engagement  parameters  and  missile  firing 
strategy.  The  stochastic  processes  engendered  by  various  firing 
strategies  are  represented  by  signal  flow  graphs,  facilitating  the 
calculations.  Results  are  utilized  to  optimize  missile  firing 
strategy.  Although  developed  for  analysis  of  firing  strategies,  the 
method  can  be  applied  to  many  analogous  problems  involving  stochastic 
duels  and  programming  under  conditions  of  uncertainty,  where  the 
situation  can  be  resolved  into  discrete  states  with  transition 
probabilities  dependent  on  both  the  state  and  the  path  by  which  it 
was  reached. 

INTRODUCTION.  When  an  air  defense  system  using  missiles,  which 
home  on  energy  furnished  by  an  illuminating  radar  and  reflected  by 
the  targets,  attempts  to  engage  a  formation  of  aircraft  (or  missiles) 
which  are  grouped  closely  enough  in  position  and  velocity  that  they 
appear  as  a  single  target  to  the  homing  missiles  until  the  latter  are 
close  to  the  formation,  a  question  arises  as  to  the  optimal  firing 
strategy.  The  choice  of  a  strategy  for  any  particular  situation 
depends  on  several  factors  which  affect  the  conditional  probability 
of  success  at  any  particular  point  in  the  process  and  which  must  be 
accounted  for  in  formulating  a  generalized  framework  for  assessing 
various  strategies.  When  a  missile  engages  the  formation,  it 
initially  homes  on  the  centroid  of  reflected  energy.  At  some  point, 
the  return  from  a  single  target  will  override  the  centroid,  and  the 
missile  may  have  to  perform  a  relatively  high-g  maneuver  in  the  end 
game,  degrading  its  kill  probability.  The  effects  are  worst  for  the 
case  of  two  targets,  where  the  energy  centroid  may  move  back  and  forth 
rapidly,  and  become  less  detrimental  as  the  number  of  targets  increases, 
since  the  energy  centroid  tends  to  remain  closer  to  the  center  of  the 
formation  in  this  case.  Therefore,  in  analyzing  the  effectiveness  of 
various  missile  firing  strategies,  it  is  necessary  to  assign  a  weighting 
factor  to  the  single  shot  kill  probability  (or  SSKP)  in  accordance  with 
the  number  of  targets  in  the  formation.  Since  the  magnitude  of  the 
weighting  factor  increases  as  the  number  of  targets  increases,  it  might 
seem  advantageous  to  fire  as  many  missiles  as  possible  in  the  first  volley. 
However,  as  the  number  of  simultaneously  fired  missiles  is  increased,  the 
probability  of  two  or  more  missiles  locking  on  the  same  target  increases, 
and  at  some  point  a  further  increase  becomes  unattractive. 
In the following method of analysis, the number of attacking
aircraft is taken as k, and the stochastic process of shooting them
down is represented by a system having k states, the number of each
state denoting the number of aircraft which have been killed at that
point in the process. The system is depicted by a single flow graph
for each firing strategy. The paths leaving each node represent all
possible ways to go from each state to succeeding states, each path
value being the conditional probability of reaching state n+p via that
path given that state n has been reached. In order to illustrate how
this technique is used to determine an optimal firing strategy, the
number of targets is taken as four, and the following four strategies,
in which $S_{n_1, n_2, \ldots, n_j, \ldots}$ refers to $n_1$ missiles
fired in the first volley, $n_2$ the second time, etc., and $n_j$ the
jth and all succeeding times, are analyzed, using the signal flow
diagrams depicted in Figures 1 through 4 in conjunction with Mason's
signal flow graph rule to effect the calculations.


MISSILE FIRING STRATEGIES

S1, 1, 1, ...

S2, 2, 2, ...

S3, 3, 3, ...

S3, 2, 1, ...

SIGNAL FLOW GRAPHS FOR UNIFORM STRATEGIES. For strategy S1, 1, 1,
..., the engagement process is represented by Figure 1 in a manner
suggested  by  Hall  [1].  The  four  states  are  represented  by  nodes  1  to 
4,  each  state  representing  the  number  of  planes  which  have  been  shot 
down  at  that  point  in  the  process.  Each  firing  of  a  missile  is  a 
Bernoulli  trial  with  the  probability  of  success,  equal  to  the  product 
of  the  single  shot  kill  probability  and  the  multiple  target  weighting 
factor  for  that  state,  determining  the  value  of  the  path  to  the  next 
state,  and  the  probability  of  failure  determining  the  path  value  of 
the  self-loop  to  the  same  state.  Path  values  are  multiplied  by  a 
dimensionless  parameter  x.  Since  the  system  function  or  ratio  of 
output  to  input,  from  the  input  to  a  specific  node  is  a  multiplicative 
function  of  the  node-to-node  path  values ,  the  exponent  of  x  in  the 
calculated  system  function,  or  gain,  to  that  node  is  equal  to  the 
number of missiles fired to reach the state represented by that node
via  that  path.  The  self-loop  in  state  four  is  necessary  to  account 
for  any  missiles  fired  or  still  in  transit  after  all  targets  are 
killed.  It  is  seen  that  the  engagement  sequence  in  this  case  is  a 
Markov  chain  with  as  many  states  as  there  are  aircraft,  each  state 
representing  the  number  of  aircraft  which  have  been  killed.  Although, 
in  this  simple  case,  it  is  feasible  to  solve  the  problem  using  transition 
matrices,  it  will  be  seen  later  that  this  technique  will  become  increasingly 
tedious for more complicated strategies. For instance, a "non-uniform"
strategy, where successive volleys may contain different numbers of
missiles, constitutes a system with memory, in which the conditional
probability of transition to the next state depends not only on the
present state, but also on how one arrived in it; i.e., how many
missiles were fired in reaching the present state. The stochastic
process then ceases to be represented by a Markov chain, and the flow
graph becomes very useful as an aid both in calculation and in
understanding the physical implications of the situation. The actual
calculations are carried out using Mason's gain formula [2, 3].
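
The transition-matrix solution mentioned above for the uniform strategy S1, 1, 1, ... can be sketched as follows. This is only an illustration: the kill probabilities are those quoted later in the paper for the four-target example, and the names P_KILL, transition_matrix and state_distribution are hypothetical helpers, not part of the original analysis.

```python
# Kill probabilities p_n for states 0..4 (number of aircraft already downed),
# taken from the four-target example worked later in the paper.
P_KILL = [0.675, 0.600, 0.375, 0.750, 0.0]


def transition_matrix(p_kill):
    """One-missile transition matrix: a hit moves to the next state,
    a miss is the self-loop back to the same state."""
    n = len(p_kill)
    matrix = [[0.0] * n for _ in range(n)]
    for state, p in enumerate(p_kill):
        matrix[state][state] = 1.0 - p      # miss (or absorbing final state)
        if state + 1 < n:
            matrix[state][state + 1] = p    # hit
    return matrix


def state_distribution(p_kill, missiles):
    """Probability of being in each state after firing `missiles` single shots."""
    matrix = transition_matrix(p_kill)
    size = len(p_kill)
    dist = [1.0] + [0.0] * (size - 1)       # start in state 0 with certainty
    for _ in range(missiles):
        dist = [sum(dist[i] * matrix[i][j] for i in range(size))
                for j in range(size)]
    return dist


if __name__ == "__main__":
    for n in (4, 5, 7, 10):
        # Last entry: probability that all four aircraft are down after n missiles.
        print(n, round(state_distribution(P_KILL, n)[-1], 4))
```

Because the final state is absorbing, the last entry of the distribution is cumulative: it is the probability that all four aircraft have been downed with n missiles or fewer.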

In  the  signal  flow  diagram,  the  value  at  each  node  is  equal  to 
the  sum  of  the  values  of  all  paths  leading  to  that  node.  Each  path 
value  is  the  product  of  the  value  of  the  node  at  the  beginning  of  the 
path  and  the  transfer  function  associated  with  that  path.  Signal  flow 
diagrams  find  their  greatest  application  in  electrical  engineering  in 
connection  with  differential  equations,  representing  control  systems, 
which  are  first  Laplace-transformed,  then  depicted  as  flow  diagrams, 
solved  using  Mason's  rule  as  described  below,  and  then  transformed 
back  to  the  time  domain.  In  the  present  case,  the  "input  signal"  is 
simply  unity  probability  of  reaching  state  0;  i.e.,  of  shooting  down 
at  least  zero  aircraft.  The  nodes  represent  states  which  are  defined 
by  the  number  of  aircraft  which  have  been  shot  down,  state  k  representing 
a  point  in  the  process  at  which  k  aircraft  have  been  shot  down.  The 
"transfer  functions"  are  simply  the  conditional  probability  of  reaching 
a  certain  state,  or  number  of  aircraft  downed,  given  that  a  certain  other 
state had been reached previously. In order to represent these probabilities
as functions of the number of missiles being shot, the conditional
probabilities are multiplied by x^n, where n is the number of missiles which
are shot in each volley when attempting to go from a node to a succeeding
node. As will be seen below, when Mason's rule is used to find the output
signal, given the input signal and the signal flow diagram, the path
values between successive nodes are multiplied. Therefore, the highest
exponent of x in the system function, or ratio of output to input, represents the
total number of missiles fired to reach the final node, or number of aircraft
downed, since it was arrived at by traversing a series path from
node to node, with the path values multiplying and therefore with the
exponents of x in each path adding. Thus, considering Figure 1, it is
obvious that the probability of shooting down four planes, i.e., of
reaching Node IV, by firing only four missiles is


$$P(\mathrm{IV}, 4) = \prod_{i=0}^{3} P_i$$


This  is  true  since,  in  order  to  down  four  aircraft  with  four  missiles,  one 
must  traverse  the  paths  representing  the  conditional  probability  of  reaching 
the  next  state  (getting  a  hit)  directly  from  Node  0  to  Node  IV  without 
traversing  any  self-loops,  which  represent  the  conditional  probability 
of remaining in the same state (getting a miss). It is seen that by
multiplying each P_i by x, the system function, or gain, for reaching
node IV by firing only four missiles is

$$G(\mathrm{IV}, 4) = \prod_{i=0}^{3} P_i x = x^4 \prod_{i=0}^{3} P_i$$

The  exponent  of  x  is  seen  to  represent  the  number  of  missiles  fired. 

If one were to miss with the jth shot, however, it would take five
missiles to shoot down the four aircraft, and the probability would
be

$$P(\mathrm{IV}, 5, \text{miss jth shot}) = q_j \prod_{\substack{i=0 \\ i \neq j}}^{3} P_i$$

In this case the system function, arrived at by multiplying each P_i by
x, would be

$$G(\mathrm{IV}, 5, \text{miss jth shot}) = q_j x \prod_{\substack{i=0 \\ i \neq j}}^{3} P_i x = x^5 q_j \prod_{\substack{i=0 \\ i \neq j}}^{3} P_i$$

Of course, the total probability of shooting down four aircraft by firing
five missiles is the sum of five such probabilities, arrived at by
considering a miss on the jth shot and letting j range from 0 to 5. However,
the system gain will still contain an x^5 term. As will be seen below, use
of Mason's rule in conjunction with a particular diagram will produce a
polynomial in x in which the coefficient of x in each term will indicate
the probability of shooting down all the aircraft, using the strategy
associated with that diagram, by firing the number of missiles indicated
by the exponent of x in that term. The calculations may be carried out
to any desired power of x (number of missiles fired) and the probability
of shooting down the aircraft approaches unity as the number of missiles
is increased without limit. If it were desired to find the probability
of reaching a lesser state, say, state k (k aircraft downed), then the
signal flow graph could be used by omitting all paths which lead to
higher nodes than Node k.

Mason's  signal  flow  graph  gain  formula  is  a  technique  for  utilizing 
a  signal  flow  graph  to  obtain  the  gain  of  the  system  instead  of  directly 
solving  the  equations  describing  the  system.  It  makes  use  of  the  gains, 
or  transfer  functions,  associated  with  forward  paths  and  loops,  the  gain 

of  a  forward  path  being  the  product  of  the  gains  of  each  segment  of  the 

path, where each segment leads from one node to another. A loop is
simply a forward path which closes on itself. The formula is:

$$G = \frac{\sum_k G_k \Delta_k}{\Delta}$$

where

G = system gain, or ratio of output to input

G_k = gain of the kth forward path

$\Delta$ = system determinant
  = 1 - (sum of all individual loop gains)
  + (sum of products of gains of all possible combinations
  of two non-touching loops) - (sum of products of gains
  of all possible combinations of three non-touching loops)
  + - ...

$\Delta_k$ = value of $\Delta$ for that part of graph not touching the kth
forward path
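
As an illustration of the rule, consider the single-missile strategy S1, 1, 1, .... The sketch below assumes that the graph of Figure 1 is a simple chain: forward branches P_i x from each node to the next, self-loops q_i x (with q_i = 1 - P_i) at nodes 0 through III, and a self-loop of value x at node IV. Under that assumption there is one forward path, every loop touches it (so $\Delta_1 = 1$), and since self-loops at different nodes never touch one another, the determinant factors into a product:

```latex
\begin{aligned}
G_1 &= x^4 \prod_{i=0}^{3} P_i, \qquad \Delta_1 = 1, \\
\Delta &= (1 - x) \prod_{i=0}^{3} (1 - q_i x), \\
G &= \frac{G_1 \Delta_1}{\Delta}
   = \frac{x^4 \prod_{i=0}^{3} P_i}{(1 - x) \prod_{i=0}^{3} (1 - q_i x)} .
\end{aligned}
```

Expanding this quotient in powers of x gives a lowest-order term of $x^4 \prod_{i=0}^{3} P_i$, which agrees with the four-missile gain G(IV, 4) obtained earlier by direct reasoning.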

While  the  mechanics  of  the  formula  are  simple,  they  frequently 
are  tedious  for  a  signal  flow  diagram  which  has  many  loops  and  forward 
paths.  Fortunately,  this  type  of  repetitive  calculation  is  easily 
carried out with a digital computer; one only needs to identify the
individual forward paths and their respective non-touching loops on a
particular graph in order to be able to use a standard program.

For  the  strategy  of  firing  successive  volleys  of  two  missiles, 
S2,  2,  2,  ...,  depicted  by  Figure  2,  it  is  seen  that  several  results 
may ensue from the firing of a volley. First, one may score two hits,
not on the same target, and will, therefore, go from state n to state
n+2. Secondly, one may score only one hit, and will, therefore, reach
state n+1. Thirdly, one may score two hits, both on the same target,
and will, therefore, reach state n+1 by a different path. Fourthly,
one  may  score  no  hits  and  remain  in  state  n.  In  order  to  assign  the 
correct  value  to  each  path,  it  is  necessary  to  know  the  probability  of 
l  missiles  homing  on  the  same  target  when  each  of  m  missiles  homes  on 
one of n targets. This will be

The effects of these probabilities are seen in the signal flow graph.
The technique of forming the graph is straightforward; all possible
transitions from one state to the next are given a path, which is
assigned the appropriate probability and multiplied by x^2, since two
missiles are being fired. The binomial coefficients are also necessary,
since the paths result from Bernoulli trials and follow the binomial
probability law. The values of all paths leaving each state node will,
of course, sum to unity; if they do not, a mistake has been made. It
is seen that the process is still a Markov chain, since two missiles
will be fired with the same probability of success when the system is
in a particular state, regardless of how that state was reached. The
extension to strategy S3, 3, 3, ..., is straightforward and results in
the flow graph in Figure 3.
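
The check that the path values leaving a node sum to unity can be sketched by brute-force enumeration. The homing model used here is an assumption made only for illustration and is not taken from the paper: each missile is taken to home independently on a uniformly chosen surviving target, each missile kills with the same probability, and several hits on one target count as a single kill. The function name volley_distribution is hypothetical.

```python
from itertools import product


def volley_distribution(missiles, targets_left, p_hit):
    """Return {kills: probability} for one volley.

    ASSUMED model (not from the paper): each missile homes independently
    on a uniformly chosen surviving target and kills it with probability
    p_hit; multiple hits on the same target count as one kill.
    """
    dist = {}
    weight = (1.0 / targets_left) ** missiles   # probability of one homing pattern
    for chosen in product(range(targets_left), repeat=missiles):
        for hits in product((True, False), repeat=missiles):
            prob = weight
            for hit in hits:
                prob *= p_hit if hit else 1.0 - p_hit
            # Distinct targets that received at least one hit.
            kills = len({t for t, hit in zip(chosen, hits) if hit})
            dist[kills] = dist.get(kills, 0.0) + prob
    return dist


if __name__ == "__main__":
    paths = volley_distribution(missiles=2, targets_left=4, p_hit=0.675)
    print(paths)
    print("sum of path values:", sum(paths.values()))
```

For a two-missile volley the three entries correspond to the self-loop (no kill), the combined paths to state n+1 (one hit, or two hits on the same target), and the path to state n+2.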

FLOW GRAPHS FOR NON-UNIFORM STRATEGIES. When one uses a non-uniform
strategy such as S3, 2, 1, 1, ..., the system becomes somewhat
more complicated, as seen in Figure 4. The number of missiles to be
fired when the system is in a particular state now depends on how many
were fired in reaching it. Therefore, the transition probabilities are
dependent not only on the present state, but also on how that state was
reached. The system has now developed a memory and can no longer be
represented by a Markov chain. Fortunately, the flow graph remains
quite simple even for this type of stochastic process. Node N,
representing the state in which n targets have been killed, is
merely split into m nodes, where each node is reached from a prior
node either by firing a volley composed of a different total number
of missiles, m being the number of different total numbers of missiles,
or by proceeding from a different node such that the same point in the
firing strategy is reached. Consider a "keyed" firing strategy
S3, 2, 1, 1, 1, ..., where the transition between volley sizes, e.g.,
between the volley of three missiles and the volley of two, is not
made until there has been a change of state; i.e., until at least one
plane has been shot down as a result of firing missiles in volleys of
three. This information is normally available from a continuous wave
illuminating radar, since a falling tone indicates that one or more
(but not how many) planes have been killed. The keyed strategy does,
of  course,  require  the  operator  to  wait  until  the  present  volley  of 
missiles  has  reached  the  target  area  before  firing  the  next  volley. 
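
The node splitting described above amounts to carrying the position in the firing sequence along with the number of kills. A minimal sketch of a keyed strategy follows. It rests on assumptions that are not in the paper: missiles home independently and uniformly on surviving targets, the kill probability for every missile in a volley is the value p_n for the state in which the volley is fired, and a volley is fired only if enough missiles remain in the budget. All names are hypothetical.

```python
from itertools import product

# Kill probabilities p_n with n aircraft already downed (from the paper's
# four-target example); TARGETS is the formation size.
P_KILL = [0.675, 0.600, 0.375, 0.750]
TARGETS = 4


def volley_distribution(missiles, targets_left, p_hit):
    """{kills: probability} for one volley under the ASSUMED homing model:
    uniform independent target choice, repeated hits count once."""
    dist = {}
    weight = (1.0 / targets_left) ** missiles
    for chosen in product(range(targets_left), repeat=missiles):
        for hits in product((True, False), repeat=missiles):
            prob = weight
            for hit in hits:
                prob *= p_hit if hit else 1.0 - p_hit
            kills = len({t for t, hit in zip(chosen, hits) if hit})
            dist[kills] = dist.get(kills, 0.0) + prob
    return dist


def keyed_strategy(volley_sizes, budget):
    """P(all targets killed using at most `budget` missiles) when the next
    volley size is adopted only after a change of state (keyed strategy)."""
    live = {(0, 0, 0): 1.0}   # (kills, stage, missiles fired) -> probability
    done = 0.0
    while live:
        nxt = {}
        for (kills, stage, fired), prob in live.items():
            size = volley_sizes[min(stage, len(volley_sizes) - 1)]
            if fired + size > budget:
                continue      # not enough missiles left for this volley
            outcomes = volley_distribution(size, TARGETS - kills, P_KILL[kills])
            for k, pk in outcomes.items():
                if kills + k == TARGETS:
                    done += prob * pk
                    continue
                # Keyed rule: advance in the sequence only if something was killed.
                key = (kills + k, stage + (1 if k > 0 else 0), fired + size)
                nxt[key] = nxt.get(key, 0.0) + prob * pk
        live = nxt
    return done


if __name__ == "__main__":
    for budget in (6, 9, 12):
        print(budget, round(keyed_strategy((3, 2, 1), budget), 4))
```

The pair (kills, stage) plays the role of a split node: on this enlarged state space the process is again a Markov chain, and the count of missiles fired corresponds to the exponent of x in the flow graph.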

For the case of four targets and firing strategy S3, 2, 1, 1, 1, ...,
this necessitates two nodes for state two and two nodes for state
three, as can be seen from the diagram. Although self-loops and
forward paths are thereby added to the flow diagram, the calculations
do not become conceptually more complicated, but merely more voluminous.
Since an electronic computer would ordinarily be used to evaluate system
gain, using Mason's rule, for situations involving a large number of
aircraft or a complicated firing strategy, this is not a serious drawback.
Indeed, the chief advantage of the method is that the complexity
of  the  calculations  does  not  increase  in  proportion  to  the  number  of 
states  in  the  system  and  the  complexity  of  the  strategy.  If  instead 
of  a  keyed  non-uniform  strategy,  one  uses  a  "pure"  strategy,  in  which 
the  transition  between  volley  sizes  is  independent  of  changes  in  state, 
it  is  necessary  to  provide  additional  split  nodes  to  accommodate  the 
paths representing misses by all missiles in a volley. This type of
path will no longer be a self-loop to the same node, but will lead to
a separate node representing the same state but requiring that a
different number of missiles be fired. For example, for the strategy
S3, 2, 1, 1, 1, the self-loop to node 0 would now lead to
subsequent nodes by the strategy S2, 1, 1, 1, ..., finally reaching node IV.
Similarly, the particular path from node 0B which represents two
misses would not be a self-loop, but would lead to node 0C, which
would then lead to subsequent nodes by the S1, 1, 1, ... strategy
depicted in Figure 1. The paths representing misses by all missiles
at other nodes would be treated in the same manner. This case is not
worked out here since it adds nothing to the explanation of the
technique, merely representing a straightforward extension of the
diagram with no difference in the manner of solution except that it
requires more steps in the computer program.

The above four strategies were analyzed for a formation of four
attacking aircraft. The SSKP was taken as .75, and the multiple
target weighting factors were taken as 0.9, 0.8, 0.5, and 1.0 for
states 0, 1, 2, and 3, respectively, in accordance with the fact,
explained above, that the multiple target effects become less
pronounced as the number of targets increases. Therefore, the resulting
kill probabilities, p_n, were .675, .600, .375, .750, and 0 for states
0, 1, 2, 3, and 4, respectively, being 0 in state 4 since there are
no remaining aircraft at this point. The diagrams were used to
calculate, for each firing strategy, the probability of killing all
four targets as a function of the number of missiles fired. As an
example, the flow diagram for S3, 2, 1, 1, 1, ..., after assigning
path values and combining parallel paths, is shown in Figure 5.

The system gain can now be found by node absorption, Mason's rule,
or a combination thereof. For example, the quotient of polynomials
obtained for S3, 3, 3, ..., was:

$$G = \frac{0.224x^{6} + 0.418x^{9} + 0.052x^{12}}{1 - 1.360x^{3} + 0.391x^{6} - 0.034x^{9} + 0.001x^{12}}$$

$$= .224x^{6} + .722x^{9} + .946x^{12} + .963x^{15} + .964x^{18} + \cdots$$

Thus,  the  probability  of  killing  all  four  targets  with,  for  example, 
twelve  missiles  fired  three  at  a  time  was  .946.  The  results  of  the 
calculations are shown in Figure 6. It is seen that S1, 1, 1, ...,
provides a higher probability of killing all four targets than do the
other  strategies  when  the  number  of  missiles  to  be  fired  is  seven  or 
less.  However,  the  probability  then  levels  off  rather  sharply,  and  a 
great  many  missiles  would  be  necessary  in  order  to  exceed  a  probability 
of .8. The curves for S2, 2, 2, ..., and S3, 3, 3, ..., have the same
general shape as that for S1, 1, 1, ..., except that they tend to level
off at a higher range of values. However, it would still be necessary
to fire a large number of missiles in order to attain a probability in
excess of .9. The curve for S3, 2, 1, 1, 1, ... tends to level off at
a high range of values, and it has the advantage of rising more quickly
to this range. The reason for this is fairly obvious, since this
strategy calls for firing a large volley at first and then smaller
volleys, taking advantage of the fact that the energy centroid of the
targets tends to remain more in the center of the formation if the
number of targets is large and that the probability of more than one
missile locking on the same target is lower for a large number of
targets. Although these effects are intuitively clear, the exact
manner in which they interact is not, and it is apparent that further
analysis along the lines suggested by Figure 6 would lead, by a kind
of dynamic programming process, to the optimal firing strategy for any
given situation if one is trying to maximize the probability of killing
all four targets by firing a certain number of missiles. If one is
interested in the probability of killing some specific number of the
attackers instead of all of them, as a function of firing strategy and
number of missiles fired, it is necessary only to delete all flow graph
nodes representing a number of kills greater than this.
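
The step from a quotient of polynomials in x to a power series, as in the S3, 3, 3, ... example above, is ordinary long division. A minimal sketch is given below, using the printed three-decimal coefficients; the function name series_quotient and the dictionary representation are illustrative choices, not the program used by the authors. Because the printed coefficients are rounded, only the leading terms can be expected to agree with the quoted series, and higher-order terms are sensitive to that rounding.

```python
def series_quotient(num, den, max_power):
    """Power-series coefficients of num(x) / den(x) up to x**max_power.

    num and den map exponent -> coefficient; den must contain a nonzero
    constant term den[0].
    """
    coeffs = [0.0] * (max_power + 1)
    for n in range(max_power + 1):
        acc = num.get(n, 0.0)
        for power, d in den.items():
            if 0 < power <= n:
                acc -= d * coeffs[n - power]   # move known terms to the right side
        coeffs[n] = acc / den[0]
    return coeffs


# Coefficients as printed for strategy S3, 3, 3, ...
NUM = {6: 0.224, 9: 0.418, 12: 0.052}
DEN = {0: 1.0, 3: -1.360, 6: 0.391, 9: -0.034, 12: 0.001}

if __name__ == "__main__":
    series = series_quotient(NUM, DEN, 12)
    for power in (6, 9, 12):
        print(f"x^{power}: {series[power]:.3f}")
```

The coefficients of x^6, x^9 and x^12 obtained this way agree with the quoted .224, .722 and .946 to within the rounding of the printed polynomials.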

If  one  is  attempting  to  optimize  some  other  aspect  of  the  situation, 
the information is generally available from Figure 6. For instance, the
expected  kills  per  missile  are  plotted  for  each  strategy  in  Figure  7. 

It  is  seen  that  strategy  S3,  2,  1,  1,  1,  ...  provides  a  higher  number 
of  expected  kills  per  missile  than  the  others  if  five  or  more  missiles 
are  fired.  In  order  to  obtain  the  true  mathematical  expectation,  of 
course,  one  would  also  need  the  probability  of  killing  three,  two,  and 
one  of  the  attacking  aircraft,  which  would  necessitate  calculations  of 
the system functions to nodes I, II, and III. The main contribution,
however, is provided by the probability of killing all four of the
aircraft, and the true expectations, although somewhat higher than the
ones  in  Figure  7,  would  not  differ  from  them  qualitatively,  and  one 
would  not  ordinarily  require  a  refinement  of  this  nature  until  it  was 
apparent  that  the  optimal  strategy  had  been  approached. 
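
The difference between the plotted quantity and the true expectation can be sketched for the single-missile strategy S1, 1, 1, .... It is assumed here, as an interpretation rather than a statement from the paper, that Figure 7 plots four times the probability of killing all four aircraft divided by the number of missiles fired; the function names are hypothetical.

```python
# Kill probabilities p_n for states 0..4, from the four-target example.
P_KILL = [0.675, 0.600, 0.375, 0.750, 0.0]


def state_distribution(missiles):
    """Distribution over the number of aircraft downed after single shots."""
    dist = [1.0, 0.0, 0.0, 0.0, 0.0]
    for _ in range(missiles):
        nxt = [0.0] * 5
        for state, prob in enumerate(dist):
            nxt[state] += prob * (1.0 - P_KILL[state])    # miss: stay
            if state < 4:
                nxt[state + 1] += prob * P_KILL[state]    # hit: advance
        dist = nxt
    return dist


def kills_per_missile(missiles):
    """Return (true expectation, ASSUMED Figure 7 approximation)."""
    dist = state_distribution(missiles)
    exact = sum(k * prob for k, prob in enumerate(dist)) / missiles
    approx = 4 * dist[4] / missiles     # only the all-four-killed term
    return exact, approx


if __name__ == "__main__":
    for n in (4, 6, 8, 12):
        exact, approx = kills_per_missile(n)
        print(n, round(exact, 3), round(approx, 3))
```

The exact value is never below the approximation, since the omitted terms for one, two and three kills are non-negative.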

Although  the  above  technique,  utilizing  representation  of  transition 
probabilities  by  signal  flow  graphs  and  subsequent  application  of  Mason's 
rule  to  calculate  system  functions  which  indicate  the  effectiveness  of 
the  relevant  strategies,  was  used  in  conjunction  with  missile  firing 
strategies  in  this  case,  it  is  readily  seen  that  it  is  applicable  to  a 
variety  of  problems  arising  in  military  operations  research  and  in  other 
situations  involving  stochastic  duels  and  programming  under  conditions 
of  uncertainty.  It  also  provides  a  facile  method  for  analyzing,  by 
means  of  an  electronic  computer,  the  effects  of  a  change  in  strategy 
(or  programming)  or  of  engagement  parameters  or  program  elements  and 
therefore is amenable to gaming.

FIG. 6. PROBABILITY OF KILLING ALL FOUR TARGETS AS A FUNCTION OF THE NUMBER OF MISSILES FIRED USING VARIOUS ENGAGEMENT STRATEGIES. (Vertical axis: PROBABILITY.)

FIG. 7. COMPARISON OF EXPECTED KILLS PER MISSILE FOR VARIOUS STRATEGIES. PROVIDES HEURISTIC INDICATION OF MORE OPTIMAL STRATEGIES, ENABLING ITERATIVE OPTIMIZING PROCEDURE. (Vertical axis: EXPECTED KILLS PER MISSILE.)

THEORY AND ASSUMPTIONS UNDERLYING
THE DEVELOPMENT OF CSP-R*

Harold W. Kelley and Fred L. Abraham
U. S. Army Ammunition Procurement and Supply Agency
Joliet, Illinois


1.0  INTRODUCTION 

This memorandum discusses the development of CSP-R, a continuous
sampling procedure involving normal, tightened, and reduced sampling inspection.
The memorandum discusses some of the considerations that led to its development
and the objectives set for the procedure during development. It also provides
the necessary mathematical derivations used in the development. CSP-R plans
will appear in MIL-STD-1235A, "Continuous Sampling Procedures and Tables for
Inspection by Attributes."

2.0  BACKGROUND 

2.1  Reduction  in  Sampling  Inspection 

When  confidence  has  been  established  that  a  manufacturing  process  is 
stable and is producing a small percentage of defective material, the user of
continuous  sampling  plans  often  has  the  desire  to  reduce  the  amount  of  sampling 
inspection  being  done. 

2.2  CSP-M 

MIL-STD-1235  contains  a  multi-level  sampling  plan,  CSP-M,  which  allows 
such  reduction  in  sampling  inspection.  In  spite  of  this  feature,  a  survey  of 
Army Ammunition Plant inspection elements indicated that CSP-M was considered
too complicated in terms of its administration to be useful. For this reason,
the CSP-M plans were generally ignored.

From a technical point of view, CSP-M contains another weakness; it is
not very responsive to a deterioration in quality if one of the reduced sampling
states has been reached. As an example, suppose that we are inspecting at
sampling rate level number five, AQL = .25%, i = 287. Suppose that a previously
low process average shifted to 1%, or four times the AQL. The probability
of continuing¹ on one hundred percent inspection after finding a defect is only
.00000000946. In fact, there is only an 80% probability that the 100% inspection
level will be reached without first reaching a certain level K or scar state
(say, level 3) and then reverting to a lower level (level 4).

¹that is, going progressively through the checking states to the
100% inspection level.

*This article has previously appeared as Technical Memorandum QEM 21-230-6.
The remainder of this paper has been reproduced photographically from the
author's copy.

2.3  CSP-1 

The simplest continuous sampling plan is, of course, CSP-1, wherein
the finding of i consecutive defect-free units on 100% inspection allows
sampling  inspection  to  begin,  during  which  the  finding  of  a  defect  causes 
a  reversion  to  100%  or  screening  inspection.  CSP-1,  however,  does  not  allow 
a  decrease  in  sampling  inspection.  Using  a  CSP-1  plan  with  the  same  AOQL 
but  with  a  smaller  sampling  frequency  may  be  a  solution,  but  indiscriminate 
shifting  between  plans  without  specified  rules  based  upon  the  mathematical 
impact  of  such  shifting  is,  of  course,  not  desirable. 

2.4 CSP-2


CSP-2,  while  not  allowing  a  reduction  in  the  sampling  frequency,  does 
delay  the  resumption  of  screening  inspection  under  certain  circumstances. 

This  feature  is  desirable  in  those  situations  where  an  alert  of  the  screening 
crew  seems  necessary,  but  it  offers  no  special  advantages  insofar  as  allowing 
a  reduction  in  sampling  inspection. 

2.5 MIL-STD-105D


MIL-STD-105D  allows  a  reduction  in  lot-by-lot  sampling  via  the  reduced 
sampling  technique.  A  history  of  good  product  quality  allows  a  reduction  in 
sample  sizes  for  subsequent  inspections.  At  the  same  time,  a  history  of 
marginal  product  quality  causes  a  tightened  inspection  to  be  initiated.  This 
tightened  inspection  sometimes  requires  a  larger  sample  size,  but  in  all 
cases the probability of accepting a lot with a given percent defective² is
lower  under  tightened  sampling  inspection. 

3.0  OBJECTIVES 

Consideration  of  the  points  mentioned  above  led  to  some  general  ideas 
about what kinds of characteristics a continuous sampling procedure should
have,  if  this  continuous  sampling  procedure  were  to  allow  a  reduction  in 
sampling  inspection  after  demonstration  of  a  low  process  average. 

3.1  Responsiveness 

The  procedure  should  be  responsive  to  an  undesirable  shift  in  the 
process  average.  This  feature  could  be  obtained  by  requiring  a  screening 
sequence  after  finding  a  defect  on  a  sampling  sequence. 


²other than 0% or 100% defective.

3.2 Simplicity

The procedure should be both simple in design and relatively easy to
administer. Although simplicity is a somewhat subjective concept, it would
seem that, generally speaking, the fewer inspection states a procedure has, the
simpler the procedure would be. Likewise, a procedure with simple rules for
switching between sampling and screening states³ is simpler than one which
requires check states or similar devices. It was felt, therefore, that a procedure
with a relatively small number of states, with the switching rules similar to those
of CSP-1, would satisfy the objective of simplicity.

3.3 Average Outgoing Quality Limit

The development of the procedure should be based on the concept of an
average outgoing quality limit (AOQL), not only to provide a limit to average
outgoing quality which will not be exceeded no matter what quality of product
is submitted for inspection, but also to establish correspondence with CSP-1
plans and other continuous sampling plans from which a user can make a choice.

3.4 Relationship with CSP-1

Common  sense  dictated  that  the  procedure  require  less  inspection  than  some 
norm  for  product  of  high  quality  and  more  inspection  for  product  of  marginal 
quality. Accordingly, it appeared reasonable that the first step of the
development would be establishment of a norm. CSP-1 was selected as this norm because it
is the most widely used of existing CSP's by Army Ammunition Plant inspection
elements.

The  attainment  of  this  objective  could  be  demonstrated  by  a  comparison 
of  Average  Fraction  Inspected  (AFI)  curves  for  the  developed  plans  with  AFI 
curves for corresponding CSP-1 plans⁴. An AFI curve shows the percentage of
units  inspected  over  the  long  run  when  the  process  average  is  of  a  certain  value. 
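
For reference, the AFI and AOQ curves of a CSP-1 plan can be sketched with the standard expressions for that plan, where f is the sampling frequency, i the clearance number and p the process average. The formulas below are the usual CSP-1 results and assume that defective units found are replaced by good ones; they are not the CSP-R formulas, which are derived in the appendices. The function names are hypothetical.

```python
def csp1_afi(p, f, i):
    """Average fraction inspected for CSP-1 at process average p.

    Standard result: AFI = f / (f + (1 - f) * q**i), with q = 1 - p.
    """
    q_i = (1.0 - p) ** i
    return f / (f + (1.0 - f) * q_i)


def csp1_aoq(p, f, i):
    """Average outgoing quality, ASSUMING defectives found are replaced
    by good units: AOQ = p * (1 - AFI)."""
    return p * (1.0 - csp1_afi(p, f, i))


if __name__ == "__main__":
    f, i = 0.1, 50          # illustrative plan parameters only
    for p in (0.0, 0.005, 0.01, 0.02, 0.05, 0.10):
        print(f"p={p:.3f}  AFI={csp1_afi(p, f, i):.3f}  AOQ={csp1_aoq(p, f, i):.5f}")
```

At p = 0 the AFI equals the sampling frequency f, and it rises toward 1 as the process average deteriorates, which is the shape against which the developed plans are compared.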

3.5 Relationship with the Normal-Tightened-Reduced Concept of MIL-STD-105

Purely as a matter of standardization, it was decided to develop the
procedure along the lines of the normal-tightened-reduced concept of MIL-STD-105D.
Users of MIL-STD-105D could adapt easily, therefore, should they have occasion to
use this procedure in MIL-STD-1235A.

³CSP-1, for example, is the epitome of simplicity in this regard.

⁴A graphical illustration of this comparison is given in [7.1].

4.0 THE DEVELOPMENT

With the objectives above in mind, development of the procedure began.
Several models were formulated and weighed against the objectives stated. Actually,
most of the objectives could be satisfied simply by designing them into the
procedure.  The  steps  used  to  evaluate  each  model  in  terms  of  its  statistical 
properties  are  discussed  below. 

4.1  Determining  the  Parameters 

After  a  general  procedure  was  defined,  which  would  satisfy,  by  its 
construction,  most  of  the  objectives,  it  became  necessary  to  investigate  the 
procedure's  relationship  with  CSP-1.  In  order  to  satisfy  the  objective  concerned 
with  this  relationship,  representative  examples  of  CSP-1  plans  were  selected.  The 
AOQL's for these plans were used in determining the parameters (sampling
frequencies and clearance numbers) for the plans based upon the procedure under
investigation. Accordingly, the AOQ formula for each procedure had to be developed⁵
and the parameters subjected to variation until the maximum resulting AOQ for any
value of the process average, p, was close to the target AOQL. In general, the
sampling  frequencies  were  held  fixed  and  the  clearance  numbers  were  allowed  to 
vary.  As  can  be  seen  from  a  study  of  Appendices  A  and  B,  this  was  no  small  task. 
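
The kind of parameter search described above, holding the sampling frequency fixed and varying the clearance number until the maximum AOQ meets the target AOQL, can be sketched for the simpler CSP-1 plan. This is only an illustration under stated assumptions: the standard CSP-1 AOQ expression is used in place of the CSP-R formula of the appendices, the AOQ curve is assumed to have a single maximum in p, and the AOQL is assumed to decrease as the clearance number grows. All function names are hypothetical.

```python
def csp1_aoq(p, f, i):
    """Standard CSP-1 average outgoing quality (defectives found are replaced)."""
    q_i = (1.0 - p) ** i
    return p * (1.0 - f) * q_i / (f + (1.0 - f) * q_i)


def csp1_aoql(f, i, iterations=200):
    """Maximum of AOQ over p, by ternary search (ASSUMES one interior peak)."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if csp1_aoq(m1, f, i) < csp1_aoq(m2, f, i):
            lo = m1
        else:
            hi = m2          # ties move left, away from the flat tail near p = 1
    return csp1_aoq((lo + hi) / 2.0, f, i)


def clearance_number(f, target_aoql, max_i=10_000):
    """Smallest i whose AOQL does not exceed the target (ASSUMES the AOQL
    decreases monotonically as i increases)."""
    if csp1_aoql(f, max_i) > target_aoql:
        raise ValueError("target AOQL not reachable within max_i")
    lo, hi = 1, max_i
    while lo < hi:
        mid = (lo + hi) // 2
        if csp1_aoql(f, mid) <= target_aoql:
            hi = mid
        else:
            lo = mid + 1
    return lo


if __name__ == "__main__":
    f, target = 0.1, 0.01    # illustrative values only
    i = clearance_number(f, target)
    print("clearance number:", i, "AOQL:", round(csp1_aoql(f, i), 5))
```

For a procedure with several sampling states the same search applies, with the single-plan AOQ function replaced by the AOQ formula of the procedure under investigation.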

4.2  Computing  the  AFI  Curves 

Once the parameters of the plan were determined, the AFI formula,
developed prior to the AOQ formula,⁶ was used to find several points
of the AFI curve for the plan. The AFI curves were then drawn on graph paper.

4.3  Comparing  AFI  Curves 

After determining the AFI curve for the plan under test, the AFI
curve for the corresponding⁷ CSP-1 plan was drawn on the same sheet of graph paper,
and the results were compared.

As discussed in 3.4 above, it was desired that a plan based on the
developed procedure require less inspection than a corresponding CSP-1 plan for
product of good quality and more inspection than CSP-1 for product of marginal
quality. Expressing this mathematically, we want

AFI (of CSP-1) > AFI (of developed plan) for $p < p_0$, and

AFI (of CSP-1) < AFI (of developed plan) for $p > p_0$,

where $p_0$ would be the "dividing line" between good and marginal quality. It was
desired to keep $p_0$ within the interval $(0, p_1)$, where $p_1$ is the value of the
process average for which the AOQ is equal to the AOQL. This choice, though
arbitrary, seemed reasonable.

⁵See Appendices A and B for the work involved in deriving the AOQ formula
for the selected procedure.

⁶See Appendix B for the AFI formula of the selected procedure.

⁷The method of establishing the correspondence was defined for each
procedure but in each case depended on the AOQL.


4.4  Selection  of  Procedure 


A procedure was finally selected which most satisfactorily fulfilled
the objectives. This procedure was designated CSP-R and is described in block
diagram form in Figure I. This procedure, while generally satisfying all of
the objectives, does not strictly satisfy the objective relative to the AFI curves
when the clearance number is very small and at the same time the sampling frequency
is very large.⁸ Since plans with these parameters are not used extensively,
this limitation did not seem restrictive.

5.0  THE  PROCEDURE 

Although Figure I is largely self-explanatory, discussion of some of the
features of CSP-R seems in order.

There are three sampling states (normal, tightened, and reduced) and
three screening states (qualification, retrial, and tightened). It can be seen
that the three sampling states are parallel to the normal-tightened-reduced
concept of MIL-STD-105D, and this is, in fact, why they are labelled as such. The
rationale for the three screening states can be found in the discussion below.

5.1  How  the  Procedure  Operates 

Entrance into the inspection states was designed to be dependent upon
the demonstrated capability of the production process, as evidenced by favorable
or unfavorable inspection results. Under the system, the qualification state
is initially entered. When evidence indicates the quality of an item has stabilized
at a satisfactory level, normal sampling is initiated. Continued evidence of
the process's capability to produce satisfactory or better quality permits the
reduced sampling state to be entered. Once reduced sampling is initiated, it
remains in effect until a defect is found, at which time the system immediately
invokes its qualification screening provisions.

The tightened inspection phase of the system was also designed to be
entered from the normal inspection phase. However, tightened inspection
provisions are invoked only when defect(ive)s fall too closely together; that
is, when the separation of defect(ive)s is less than a prescribed minimum
spacing. Tightened screening remains in effect until sufficient evidence
indicates the process is capable of generating an item of at least marginal quality.
Once this evidence is established, tightened sampling is initiated. The normal
sampling state may then be re-entered if evidence of favorable inspection
continues. If not, the system invokes its qualification screening provisions
and continues as before.

⁸Those plans in MIL-STD-1235 associated with large AQL's and the lower
code letters.
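
The transitions described above can be summarized as a small lookup table. The following is only a sketch in Python, not part of the procedure itself: Figure I is not reproduced here, so the table is inferred from the text of 5.1 and from equations (13) through (25) of Appendix A. The state labels and the two event names (`clear`, meaning the required run of consecutive non-defective inspected units has been completed, and `defect`) are illustrative assumptions.

```python
# Illustrative state labels (assumed, following the notation of 6.2):
#   Q  qualification screening    N* retrial screening    T* tightened screening
#   N  normal sampling            T  tightened sampling   R  reduced sampling
CLEAR = "clear"    # clearance run completed (i units in Q and N; i* in N*, T*, T)
DEFECT = "defect"  # a defective unit was found on inspection

TRANSITIONS = {
    ("Q", CLEAR): "N",   ("Q", DEFECT): "Q",    # a defect restarts qualification
    ("N", CLEAR): "R",   ("N", DEFECT): "N*",   # one defect on normal: retrial
    ("N*", CLEAR): "N",  ("N*", DEFECT): "T*",  # defects too close: tightened
    ("T*", CLEAR): "T",  ("T*", DEFECT): "T*",
    ("T", CLEAR): "N",   ("T", DEFECT): "Q",
    ("R", DEFECT): "Q",                          # reduced has no clearance exit
}


def next_state(state, event):
    """Return the state entered after `event`; stay put if no rule applies."""
    return TRANSITIONS.get((state, event), state)


def walk(events, start="Q"):
    """Return the list of states visited, beginning with `start`."""
    path = [start]
    for event in events:
        path.append(next_state(path[-1], event))
    return path


if __name__ == "__main__":
    print(walk([CLEAR, DEFECT, DEFECT, CLEAR, CLEAR]))
```

The table form makes the asymmetry discussed in 5.1 visible: reduced sampling is reached only through two successive clearance runs, but a single defect returns the system to qualification screening.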

The similarity of the provisions governing transitions between states in CSP-R
and those associated with the MIL-STD-105D scheme is apparent. However,
under the MIL-STD-105D scheme there is a transition from reduced to normal
sampling not only upon an unfavorable inspection result (rejection of a lot), but
also upon acceptance under the procedures of 10.1.4 of that Standard.⁹ We
therefore see that the reduced state is entered with difficulty, but left immediately
should doubt arise as to the continued high quality of material. The analogous
CSP-R provision is the requirement of re-entrance into the qualification screening
state. This provision, though admittedly drastic, was established to assure
performance of sufficient screening to guarantee that the previously good
quality level had not deteriorated.

The retrial screening provision of CSP-R was designed to represent a
reasonable balance between: (1) the need for assurance of the previously
established quality level for normal sampling and (2) a desire to avoid a premature
decision to invoke the tightened provisions.

5.2  Properties  of  the  Parameters 

In common with most CSP plans, those of CSP-R were developed to be based
on AOQL and defined by the parameters $f_j$ and $i_k$, where $f_j$ is the sampling
frequency in the jth sampling state and $i_k$ is the clearance number in the kth
screening state. Also, in common with most CSP plans, the parameters $f_j$ and $i_k$
of CSP-R plans determine the AOQ function, as discussed previously.

To maintain the normal, tightened, and reduced inspection concept, the
following relationship among sampling frequencies was used: $f_T > f_N > f_R$, where
the subscripts T, N, and R refer to tightened, normal, and reduced sampling,
respectively. Since CSP-1 had been established as the norm, it was decided to
equate $f_N$ of CSP-R to $f$ of CSP-1 for equal AOQL and production interval size.
Consequently, sampling rates $f_T$, $f_N$, and $f_R$ in CSP-R could be the frequencies
for any three consecutive code letters under CSP-1 for a given AOQL. This was
conducive to simplicity.

Two values of $i_k$ were established for the procedure: $i$ and $i^*$. The
relationship between $i$ and $i^*$ is $i^* = i/2$ (with a few exceptions). The choice
of this relationship between $i$ and $i^*$ was predicated upon the need for more
stringent requirements for entering reduced sampling than for entering
tightened inspection. It had been noted that the MIL-STD-105D scheme generally
requires ten consecutively accepted lots (plus the defects in these ten lots
being less than a prescribed minimum number) under normal sampling to qualify
for reduced sampling, but only five consecutively accepted lots on tightened to
re-enter normal. Hence, the relationship between $i$ and $i^*$ followed by analogy.

In addition, it was noted that the MIL-STD-105D scheme invokes tightened inspection
provisions if any two of five (analogous to $i^*$) consecutive lots are rejected
on normal. CSP-R was designed to require tightened inspection when defect(ive)s
are separated by fewer than $i^*$ units; one defect(ive) being permitted in normal
sampling but not another in retrial screening.

⁹That is, when there is not strong evidence that quality is superior.

6.0  DERIVATION  OF  FORMULAE 

As  mentioned  previously,  the  development  of  CSP-R  required,  upon  setting 
up  a  hypothetical  procedure,  the  determination  of  the  mathematical  properties  of 
the  procedure,  so  that  appropriate  comparisons  could  be  made. 

6.1  The  Flow  Diagram 

The  first  step  in  constructing  the  appropriate  mathematical  model  would 
be  to  outline  the  procedure  in  flow  diagram  form.  Figure  I  is  the  flow  diagram 
of  CSP-R. 


6.2 Events Causing a New State/Phase to be Entered

The next step is to look over each of the blocks in the flow diagram
and determine the events causing a state and/or phase to be entered. As used
herein, "state" refers to either qualification, retrial, or tightened screening,
or normal, tightened, or reduced sampling. "Phase" refers to either the units
inspected during a sampling state, or the units skipped during a sampling state.

Figure II shows the events laid out in matrix form. The following
notation has been used in Figure II:

Q = Qualification state

N = Normal sampling state

T = Tightened sampling state

R = Reduced sampling state

N* = Retrial screening state

T* = Tightened screening state

The subscripts I and S in Figure II pertain to phases of sampling
states. I denotes the phase when a unit is being inspected, and S denotes the
phase when the units are being skipped.

[Figure II. Events causing a state/phase to be entered/re-entered (matrix not reproduced).]


6.3 State Probabilities

The next step is to develop formulae for determining the percentage
of units, over the long run, which will reach the point of inspection during each
of the states. The development of these formulae for CSP-R is shown in Appendix A.

6.4  The  AFI 


Next, the AFI formula must be developed. This development is shown in
Appendix B for CSP-R. The resultant formula is

$$\mathrm{AFI} = P_Q + P_{N^*} + P_{N_I} + P_{T^*} + P_{T_I} + P_{R_I}$$

where $P_j$ is the state probability of state $j$, where the subscripts are defined
as in 6.2 above.

6.5  The  AOQ 

Upon determining the expression for the AFI, the AOQ formula can be
constructed rather simply. This is shown in Appendix B. The resultant formula is

$$\mathrm{AOQ} = \frac{p\,[1 - \mathrm{AFI}]}{1 - p\,(\mathrm{AFI})}$$

where p is the probability of a defective unit.
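
As an illustration of 6.4 and 6.5, the closed-form steady state probabilities of A.4 (equations (1'') through (12'') of Appendix A) can be evaluated directly. The sketch below is in Python, which is an assumption of this illustration and not the language of the original computations; the function names and the sample parameter values are hypothetical and are not plans taken from MIL-STD-1235A.

```python
def csp_r_state_probabilities(p, i, i_star, f_n, f_t, f_r):
    """Steady state probabilities of CSP-R per (1'')-(12'') of Appendix A.

    Requires 0 <= p < 1. Keys ending in _I are inspection phases and keys
    ending in _S are skipping phases of the three sampling states.
    """
    q = 1.0 - p
    a = q ** i        # q^i
    b = q ** i_star   # q^(i*)
    numerators = {
        "Q":   b * (1 - a) * (a + (1 - a) * (1 - b) ** 2),
        "N*":  b * a * (1 - a) * (1 - b),
        "T*":  a * (1 - a) * (1 - b) ** 2,
        "N_I": b * a * (1 - a),
        "N_S": b * a * (1 - a) * (1 / f_n - 1),
        "T_I": b * a * (1 - a) * (1 - b) ** 2,
        "T_S": b * a * (1 - a) * (1 - b) ** 2 * (1 / f_t - 1),
        "R_I": a * a * b,
        "R_S": a * a * b * (1 / f_r - 1),
    }
    d = sum(numerators.values())  # equals the quantity D of equation (30)
    return {state: value / d for state, value in numerators.items()}


def afi(p, i, i_star, f_n, f_t, f_r):
    """Average fraction inspected: sum over screening and inspection phases."""
    probs = csp_r_state_probabilities(p, i, i_star, f_n, f_t, f_r)
    return sum(probs[s] for s in ("Q", "N*", "N_I", "T*", "T_I", "R_I"))


def aoq(p, i, i_star, f_n, f_t, f_r):
    """Average outgoing quality, defectives removed but not replaced."""
    fraction = afi(p, i, i_star, f_n, f_t, f_r)
    return p * (1.0 - fraction) / (1.0 - p * fraction)


if __name__ == "__main__":
    # Hypothetical plan: i = 20, i* = 10, f_N = 1/10, f_T = 1/5, f_R = 1/20.
    for p in (0.005, 0.01, 0.02, 0.05):
        print(p, afi(p, 20, 10, 0.1, 0.2, 0.05), aoq(p, 20, 10, 0.1, 0.2, 0.05))
```

At p = 0 the system remains in reduced sampling indefinitely, so the computed AFI reduces to the reduced sampling frequency, which is a convenient check on the implementation.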

6.6  Determining  the  Parameters 

Using a certain value of AOQL and establishing values for the sampling
frequencies, the AOQ formula, through an iterative process, was used to develop
the values of $i$ and $i^*$ for the CSP-R plans which will appear in MIL-STD-1235A.

It should be pointed out here that the AOQL's used in MIL-STD-1235A are generally
less than the corresponding values in MIL-STD-1235. This is because the AOQL's
in MIL-STD-1235A have been matched (with certain limitations) to the AOQL's of
the MIL-STD-105D single sampling schemes (with the same AQL), treating the scheme
as encompassing normal, tightened, and reduced inspection. The effect of tightened
inspection caused the resultant AOQL's to be lower.
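
The iterative process is not spelled out in the text. One way it might be carried out is sketched below in Python, under several assumptions that are not taken from the source: the relationship $i^* = i/2$ of 5.2 is applied without exceptions, only even values of $i$ are tried, and the AOQL is approximated by the largest AOQ over a fixed grid of process averages. The function names, the grid, and the sample target are hypothetical.

```python
def afi(p, i, i_star, f_n, f_t, f_r):
    """AFI of CSP-R from the closed forms of Appendix A (0 <= p < 1)."""
    q = 1.0 - p
    a, b = q ** i, q ** i_star
    n = b * a * (1 - a)            # inspected units, normal sampling
    t = n * (1 - b) ** 2           # inspected units, tightened sampling
    r = a * a * b                  # inspected units, reduced sampling
    screened = (b * (1 - a) * (a + (1 - a) * (1 - b) ** 2)   # Q
                + n * (1 - b)                                 # N*
                + a * (1 - a) * (1 - b) ** 2)                 # T*
    d = screened + n / f_n + t / f_t + r / f_r
    return (screened + n + t + r) / d


def aoql(i, f_n, f_t, f_r, grid):
    """Approximate AOQL: the largest AOQ over the grid, with i* = i // 2."""
    worst = 0.0
    for p in grid:
        fraction = afi(p, i, i // 2, f_n, f_t, f_r)
        worst = max(worst, p * (1 - fraction) / (1 - p * fraction))
    return worst


def find_clearance(target_aoql, f_n, f_t, f_r, i_max=2000):
    """Smallest even i whose approximate AOQL does not exceed the target."""
    grid = [k / 1000.0 for k in range(1, 301)]   # p = 0.001, ..., 0.300
    for i in range(2, i_max + 1, 2):
        if aoql(i, f_n, f_t, f_r, grid) <= target_aoql:
            return i
    raise ValueError("no clearance number up to i_max meets the target")


if __name__ == "__main__":
    i = find_clearance(0.02, 0.1, 0.2, 0.05)
    print("i =", i, "i* =", i // 2)
```

Holding the sampling frequencies fixed and varying only the clearance number mirrors the approach described in 4.1; a finer grid or a one-dimensional maximizer would sharpen the AOQL estimate.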

6.7  Computing  the  Curve  Points 

During development, the curve points (AFI and AOQ) were computed for
certain representative plans. Upon selection of the CSP-R procedure, curves for
each of the plans were computed on the Agency's RCA 501 digital computer.
Additionally, Operating Characteristic (OC) Curves were computed. The derivation of
the formula for the OC Curves appears in Appendix C. These curves, should they
appear in MIL-STD-1235A, will show the percentage of units accepted on a sampling
basis, for each value of the process average, p.

6.8 Assumptions Used in the Derivations

Throughout this discussion, we will assume Figure I defines an ergodic
Markov process. Thus, after many steps have occurred in the system, the
probability of being in any given state of CSP-R tends to become a steady state
probability which is independent of the number of steps but dependent upon the
state in which the system was at the last step and upon the transitional
probabilities.

We  will  further  assume: 

(1) All items are classified correctly, i.e., defect(ive) or
non-defect(ive);

(2) The production process is in statistical control;

(3) When sampling is in effect, every $(1/f_j)$th unit is inspected, with
screening required to begin with the next unit after a defective
is observed (see below); and

(4) Defective units found are removed but not replaced by
non-defectives.

We  will  digress  here  to  briefly  discuss  the  effect  of  these  assumptions. 

The assumptions above have been adopted largely because they lead to the
simplest mathematics. However, the use of these assumptions does not imply that
CSP-R plans are invalid if conditions other than those assumed apply. What their
use does imply is simply that the plans have been designed with these conditions
in mind. Deviations from the stated conditions will, in general, affect the
AFI function and result in values of AOQL higher than the theoretical values
computed from formulae derived herein. Although the modifications of the theoretical
AOQL values resulting from such deviations have not been thoroughly explored, some
treatment of alternatives has been made [7.11], [7.12], [7.13], [7.14], [7.15],
[7.16].

Assumption (3) above has been adopted solely for mathematical convenience.
It is recognized that the theoretically best method of sampling would be
probabilistic, i.e., each unit would be inspected with probability $f_j$, independent of
other units. However, strict adherence to this method in an actual production
situation would be impractical, if not impossible. In some instances, block (or
group) sampling may be required; in others, probabilistic or the assumed systematic
sampling method may be in order. Thus, MIL-STD-1235 provides for the selection of
sample units "so as to give each unit of product an equal chance of being inspected"
with the inspector allowing the interval between sample units to vary somewhat
rather than drawing "sample units according to a rigid pattern." The effect of
assumption (3) is to provide AOQL values of the same magnitude as those
computed under the assumption of probability sampling.
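
The role of these assumptions can be illustrated by simulating the procedure under systematic sampling and comparing the observed fraction inspected with the closed-form AFI. The Python sketch below is illustrative only: the transition rules are inferred from equations (13) through (25) of Appendix A (Figure I is not reproduced), each sampling interval $1/f_j$ is assumed to be an integer, and the function names, seed, and parameter values are hypothetical.

```python
import random


def afi_formula(p, i, i_star, f_n, f_t, f_r):
    """Closed-form AFI of CSP-R (Appendix A and B.1), for 0 <= p < 1."""
    q = 1.0 - p
    a, b = q ** i, q ** i_star
    n = b * a * (1 - a)
    t = n * (1 - b) ** 2
    r = a * a * b
    screened = (b * (1 - a) * (a + (1 - a) * (1 - b) ** 2)
                + n * (1 - b) + a * (1 - a) * (1 - b) ** 2)
    return (screened + n + t + r) / (screened + n / f_n + t / f_t + r / f_r)


def simulate_afi(p, i, i_star, f_n, f_t, f_r, n_inspections=200_000, seed=1):
    """Observed fraction inspected when every (1/f)th unit is sampled."""
    rng = random.Random(seed)
    clearance = {"Q": i, "N": i, "N*": i_star, "T*": i_star, "T": i_star, "R": None}
    skip = {"N": round(1 / f_n) - 1, "T": round(1 / f_t) - 1, "R": round(1 / f_r) - 1}
    on_clear = {"Q": "N", "N": "R", "N*": "N", "T*": "T", "T": "N"}
    on_defect = {"Q": "Q", "N": "N*", "N*": "T*", "T*": "T*", "T": "Q", "R": "Q"}
    state, run = "Q", 0
    inspected = total = 0
    for _ in range(n_inspections):
        # In a sampling state, 1/f - 1 units are passed before each inspection.
        total += skip.get(state, 0) + 1
        inspected += 1
        if rng.random() < p:                       # defective unit found
            state, run = on_defect[state], 0
        else:
            run += 1
            if clearance[state] is not None and run >= clearance[state]:
                state, run = on_clear[state], 0    # clearance run completed
    return inspected / total


if __name__ == "__main__":
    args = (0.05, 10, 5, 0.1, 0.2, 0.05)
    print(simulate_afi(*args), afi_formula(*args))
```

Replacing the fixed skip count with a geometrically distributed one would give the probabilistic sampling variant discussed above; the long-run fraction inspected should be the same in expectation.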
APPENDIX A

DERIVATION OF STATE PROBABILITY FORMULAE

A.1 GENERAL

A.1.1 In deriving the steady state and state entrance probabilities, we will
define

$p$ = probability of a defective unit;

$q = 1 - p$ = probability of a non-defective unit;

$i$ = clearance number for states Q, N;

$i^*$ = clearance number for states N*, T*, T;

and for $j$ = Q, N, N*, T, T*, R, let

$P_j$ = Prob. (being in state $j$ on the present step);

$P'_j$ = Prob. (entering state $j$);

$f_j$ = the sampling rate for state $j$.

A step will be defined as the inspection of a unit of product.


A.1.2 When the process is in states $j$ = N, R, T, some units are being skipped
(passed) while others are being sampled and inspected. In the derivations the
skipped unit possibilities in these states will be considered. It is convenient,
therefore, to partition states $j$ = N, R, T into skipping and sampling phases.
Let

$P_{j_S}$ = Prob. (being in the skipping phase of state $j$);

and

$P_{j_I}$ = Prob. (being in the sampling phase of state $j$).

Then,

$P_j = P_{j_S} + P_{j_I}$.

Moreover, it is convenient to partition the skipping phase, $j_S$, into skip unit
phase one and skip unit phase two. Therefore, let

$P_{j_{S0}}$ = Prob. (being in skip unit phase one of state $j$);

and

$P_{j_{S1}}$ = Prob. (being in skip unit phase two of state $j$).

Then,

$P_{j_S} = P_{j_{S0}} + P_{j_{S1}}$ for $j$ = N, R, T.

Skip unit phase one will be defined as that phase of $j_S$ initially entered,
and skip unit phase two will be defined as that phase of $j_S$ in which all
subsequent skips occur. Skip unit phase one may therefore be viewed as being
a "transitional" phase between the last step in some previous state and the first
step in the present state.

The  preceding  state/phase  symbols  with  primes  will  be  used  to  denote  the 
probability  of  entering  a  given  state/phase  on  the  present  step. 


A.2

(1) $P_Q$ = Prob. (just entering state Q on the last step) +
Prob. (entering Q two steps ago and inspecting a
non-defective on the last step) + . . . +
Prob. (entering Q $i$ steps ago and inspecting $i-1$
consecutive non-defective units)

$= P'_Q + P'_Q q + P'_Q q^2 + \cdots + P'_Q q^{i-1}$

$= P'_Q (1 - q^i)/p$

(2) $P_N$ = Prob. (being in the sampling phase of state N) +
Prob. (being in the skip unit phase of state N)

$= P_{N_I} + P_{N_S}$

(3) $P_{N_I}$ = Prob. (just entering phase $N_I$ on the last step)
+ Prob. (entering phase $N_I$ two steps ago and
inspecting a good unit on the last step)
+ . . . + Prob. (entering $N_I$ $i$ steps ago and
inspecting $i-1$ consecutive non-defective units)

$= P'_{N_I} + P'_{N_I} q + P'_{N_I} q^2 + \cdots + P'_{N_I} q^{i-1}$

$= P'_{N_I} (1 - q^i)/p$

(4) $P_{N_S}$ = Prob. (entering skip unit phase of N and passing
(skipping) the next $(1/f_N) - 1$ units)

$= P'_{N_S} [(1/f_N) - 1]$.

Similarly then,

(5) $P_T = P_{T_I} + P_{T_S}$

(6) $P_{T_I} = P'_{T_I} (1 - q^{i^*})/p$

(7) $P_{T_S} = P'_{T_S} [(1/f_T) - 1]$

(8) $P_R = P_{R_I} + P_{R_S}$

(9) $P_{R_I} = P'_{R_I} (1/p)$

(10) $P_{R_S} = P'_{R_S} [(1/f_R) - 1]$

(11) $P_{T^*} = P'_{T^*} (1 - q^{i^*})/p$

(12) $P_{N^*} = P'_{N^*} (1 - q^{i^*})/p$
A.3 EXPRESSIONS FOR THE STATE ENTRANCE PROBABILITIES

With the aid of FIGURE II we obtain

(13) $P'_Q$ = Prob. (being in state Q and finding a defective unit) +
Prob. (being in state R and finding a defective unit) +
Prob. (being in state T and finding a defective unit)

$= P_Q \cdot p + P_{R_I} \cdot p + P_{T_I} \cdot p$

Combining (1), (6), and (9) with the above yields

(14) $P'_Q = P'_Q (1 - q^i) + P'_{R_I} + P'_{T_I} (1 - q^{i^*})$.

In a similar manner the other $P'_j$'s are obtained:

(15) $P'_{N^*} = P'_{N_I} (1 - q^i)$

(16) $P'_{T^*} = P'_{T^*} (1 - q^{i^*}) + P'_{N^*} (1 - q^{i^*})$

(17) $P'_{N_{S0}} = P'_Q q^i + P'_{T_I} q^{i^*} + P'_{N^*} q^{i^*}$

(18) $P'_{N_{S1}} = P'_{N_I} q + P'_{N_I} q^2 + P'_{N_I} q^3 + \cdots + P'_{N_I} q^{i-1}$

(19) $P'_{N_I} = P'_{N_{S0}}$

(20) $P'_{T_{S0}} = P'_{T^*} q^{i^*}$

(21) $P'_{T_{S1}} = P'_{T_I} q + P'_{T_I} q^2 + \cdots + P'_{T_I} q^{i^*-1}$

(22) $P'_{T_I} = P'_{T_{S0}}$

(23) $P'_{R_{S0}} = P'_{N_I} q^i$

(24) $P'_{R_{S1}} = P'_{R_I} q + P'_{R_I} q^2 + \cdots$

(25) $P'_{R_I} = P'_{R_{S0}}$

By definition

(26) $P'_{N_S} = P'_{N_{S0}} + P'_{N_{S1}}$

Then, from (17) and (18), together with (19), we obtain

(27) $P'_{N_S} = P'_{N_I} + P'_{N_I} q + P'_{N_I} q^2 + \cdots + P'_{N_I} q^{i-1} = P'_{N_I} (1 - q^i)/p$.

Similarly,

(28) $P'_{T_S} = P'_{T_I} (1 - q^{i^*})/p$

(29) $P'_{R_S} = P'_{R_I} (1/p)$.

A.4 EXPRESSIONS FOR THE STEADY STATE AND STATE ENTRANCE PROBABILITIES IN
TERMS OF KNOWN PARAMETERS

A.4.1 Equations (14), (15), (16), (19), (22), (25), (27), (28), (29) define
nine equations in nine unknown entrance probabilities. These equations and their
associated steady state probabilities may be expressed in terms of parameters $p$,
$q$, $i$, $i^*$, and $f_j$, which are assumed known. This section discusses the derivation
of such expressions.

A.4.2 In lieu of solving explicitly for each $P_j$ and $P'_j$, it was convenient to
first express each state entrance probability, $P'_j$, in terms of $P'_{N_I}$.
Equations (14) through (29) were then used to obtain:


(14') $P'_Q = P'_{N_I} [q^i + (1 - q^i)(1 - q^{i^*})^2]/q^i$

(15') $P'_{N^*} = P'_{N_I} (1 - q^i)$

(16') $P'_{T^*} = P'_{N_I} (1 - q^i)(1 - q^{i^*})/q^{i^*}$

(22') $P'_{T_I} = P'_{N_I} (1 - q^i)(1 - q^{i^*})$

(25') $P'_{R_I} = P'_{N_I} q^i$

(27') $P'_{N_S} = P'_{N_I} (1 - q^i)/p$

(28') $P'_{T_S} = P'_{N_I} (1 - q^i)(1 - q^{i^*})^2/p$

(29') $P'_{R_S} = P'_{N_I} q^i/p$

When the preceding primed equations are substituted into equations (1), (12),
(11), (6), (9), (4), (7), and (10) respectively, the following steady state
probability equations are obtained in terms of $P'_{N_I}$:

(1') $P_Q = P'_{N_I} [q^i + (1 - q^i)(1 - q^{i^*})^2](1 - q^i)/(p\, q^i)$

(3') $P_{N_I} = P'_{N_I} (1 - q^i)/p$

(4') $P_{N_S} = P'_{N_I} [(1 - q^i)/p][(1/f_N) - 1]$

(6') $P_{T_I} = P'_{N_I} (1 - q^i)(1 - q^{i^*})^2/p$

(7') $P_{T_S} = P'_{N_I} [(1 - q^i)(1 - q^{i^*})^2][(1/f_T) - 1]/p$

(9') $P_{R_I} = P'_{N_I} q^i/p$

(10') $P_{R_S} = P'_{N_I} q^i [(1/f_R) - 1]/p$

(11') $P_{T^*} = P'_{N_I} (1 - q^i)(1 - q^{i^*})^2/(q^{i^*} p)$

(12') $P_{N^*} = P'_{N_I} (1 - q^i)(1 - q^{i^*})/p$

Since $\sum_j P_j = 1$, equations (1'), (3'), (4'), (6'), (7'), (9'), (10'), (11'),
and (12') are combined to obtain

(30) $P'_{N_I} = p\, q^{i^*} q^i / D$;

where

$$D = q^{i^*}(1 - q^i)[q^i + (1 - q^i)(1 - q^{i^*})^2] + q^i q^{i^*}(1 - q^i)(1 - q^{i^*}) + q^i (1 - q^i)(1 - q^{i^*})^2 + q^{i^*} q^i (1 - q^i)/f_N + q^{2i} q^{i^*}/f_R + q^i q^{i^*}(1 - q^i)(1 - q^{i^*})^2/f_T.$$

Expressions for the steady state probabilities in terms of known parameters can
now be obtained by substituting equation (30) into the primed number equations.

(1'') $P_Q = q^{i^*}(1 - q^i)[q^i + (1 - q^i)(1 - q^{i^*})^2]/D$

(3'') $P_{N_I} = q^{i^*} q^i (1 - q^i)/D$

(4'') $P_{N_S} = q^{i^*} q^i (1 - q^i)[(1/f_N) - 1]/D$

(6'') $P_{T_I} = q^{i^*} q^i (1 - q^i)(1 - q^{i^*})^2/D$

(7'') $P_{T_S} = q^{i^*} q^i (1 - q^i)(1 - q^{i^*})^2 [(1/f_T) - 1]/D$

(9'') $P_{R_I} = q^{2i} q^{i^*}/D$

(10'') $P_{R_S} = q^{2i} q^{i^*} [(1/f_R) - 1]/D$

(11'') $P_{T^*} = q^i (1 - q^i)(1 - q^{i^*})^2/D$

(12'') $P_{N^*} = q^{i^*} q^i (1 - q^i)(1 - q^{i^*})/D$
APPENDIX B

DERIVATION OF AFI AND AOQ

B.1 THE AFI FUNCTION

B.1.1 By definition AFI is the expected ratio of the total number of units
inspected to the total number of units inspected or passed. Thus, by letting

$K_j$ = the number of units passing through the inspection system in
state $j$;

$K_{j_I}$ = the number of units inspected in state $j$;

$K_{j_S}$ = the number of units skipped (passed) in state $j$;

$K_j = K_{j_I} + K_{j_S}$;

and $K = \sum_j K_j$; where $j$ = Q, N, R, T, N*, T*,
we may write

(1) $$\mathrm{AFI} = \lim_{K \to \infty} \frac{\sum_j K_{j_I}}{\sum_j K_{j_I} + \sum_j K_{j_S}} = \lim_{K \to \infty} \frac{K_Q + K_{N^*} + K_{T^*} + K_{N_I} + K_{T_I} + K_{R_I}}{K_Q + K_{N^*} + K_{T^*} + K_{N_I} + K_{T_I} + K_{R_I} + K_{N_S} + K_{T_S} + K_{R_S}}$$

$$= P_Q + P_{N^*} + P_{N_I} + P_{T^*} + P_{T_I} + P_{R_I}.$$

Using (1''), (3''), (4''), (6''), (7''), (9''), (10''), (11''), and (12'') of A.4, APPENDIX
A, expressions for the AFI in terms of the parameters $p$, $q$, $i$, $i^*$, and $f_j$ are
obtained.

B.2 THE AOQ FUNCTION

Dodge and Romig [7.17] have given expressions for the AOQ functions under
two assumptions:

Case I: Defective units are removed and replaced by non-defective units.

Case II: Defective units are removed but not replaced.

Case II is consistent with standard operating procedure in most
ammunition inspection situations. Accordingly, appealing to the Dodge and Romig
expression, we used the following:

(1) $$\mathrm{AOQ} = \frac{p\,[1 - \mathrm{AFI}]}{1 - p\,(\mathrm{AFI})};$$

where AFI is as defined by (1) of B.1 and $p$ is the probability of a defective unit.
APPENDIX C

DERIVATION OF O.C.

By definition the fraction of product accepted on a sampling basis is the
expected ratio of the number of units accepted on a sampling basis to the total
number of units inspected or passed. Now, recalling the assumption that
defective units are removed but not replaced, the number of units accepted on a sampling
basis must consist of only those units inspected and found non-defective
plus those units passed (skipped and therefore accepted) in the sampling inspection
states. Thus, appealing to the notation of A.1 of APPENDIX A, and B.1 of APPENDIX B,
we write for $j$ = N, R, and T

(1) $$\mathrm{O.C.}\,(\%) = \lim_{K \to \infty} \frac{\sum_j (K_{j_S} + q K_{j_I})}{K} \times 100 = [P_{N_S} + P_{T_S} + P_{R_S} + q P_{N_I} + q P_{T_I} + q P_{R_I}] \times 100.$$

Using the equations of A.2 and A.3 of APPENDIX A, it can be shown that

$P_{N_S} = P_{N_I} [(1/f_N) - 1]$,

$P_{R_S} = P_{R_I} [(1/f_R) - 1]$, and

$P_{T_S} = P_{T_I} [(1/f_T) - 1]$.

Therefore, equation (1) above can be written as

(2) $$\mathrm{O.C.}\,(\%) = 100 \{[(1/f_N) - 1 + q] P_{N_I} + [(1/f_R) - 1 + q] P_{R_I} + [(1/f_T) - 1 + q] P_{T_I}\}$$

$$= 100 \{[(1/f_N) - p] P_{N_I} + [(1/f_R) - p] P_{R_I} + [(1/f_T) - p] P_{T_I}\}.$$
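
Equation (2) can be evaluated with the inspection-phase probabilities (3''), (6''), and (9'') of Appendix A. The following Python sketch is offered only as an illustration; the function name and the sample parameter values are hypothetical and do not correspond to plans in MIL-STD-1235A.

```python
def oc_percent(p, i, i_star, f_n, f_t, f_r):
    """Percent of units accepted on a sampling basis, equation (2) above.

    Uses the closed forms of A.4 and requires 0 <= p < 1.
    """
    q = 1.0 - p
    a, b = q ** i, q ** i_star
    n = b * a * (1 - a)            # numerator of P_{N_I}
    t = n * (1 - b) ** 2           # numerator of P_{T_I}
    r = a * a * b                  # numerator of P_{R_I}
    screened = (b * (1 - a) * (a + (1 - a) * (1 - b) ** 2)   # Q
                + n * (1 - b)                                 # N*
                + a * (1 - a) * (1 - b) ** 2)                 # T*
    d = screened + n / f_n + t / f_t + r / f_r                # D of equation (30)
    accepted = (1 / f_n - p) * n + (1 / f_t - p) * t + (1 / f_r - p) * r
    return 100.0 * accepted / d


if __name__ == "__main__":
    # Hypothetical plan: i = 10, i* = 5, f_N = 1/10, f_T = 1/5, f_R = 1/20.
    for p in (0.0, 0.01, 0.05, 0.10):
        print(p, oc_percent(p, 10, 5, 0.1, 0.2, 0.05))
```

With a perfect process (p = 0) every unit is accepted under reduced sampling, so the function returns 100; the value falls as p increases and more product is diverted to screening.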
AN EVALUATION OF LINEAR LEAST SQUARES COMPUTER PROGRAMS:
A SUMMARY REPORT


Roy H. Wampler

National Bureau of Standards
Washington, D. C.

ABSTRACT. Two linear least squares test problems based on fifth
degree polynomials have been run on more than twenty different computer
programs in order to assess their numerical accuracy. Among the programs
tested were representatives from various statistical packages as well as
some from the SHARE library. Essentially four different algorithms were
used in the various programs to obtain the coefficients of the least
squares fits. The tests were run on several different computers, in
double precision as well as single precision. By comparing the
coefficients reported, it was found that those programs using orthogonal
Householder transformations or Gram-Schmidt orthonormalization were much
more accurate than those using elimination algorithms. Programs using
orthogonal polynomials (suitable only for polynomial fits) also proved
to be superior to those using elimination algorithms. The most successful
programs accumulated inner products in double precision and made use of
iterative refinement procedures. In a number of programs, the coefficients
reported in one test problem were sometimes completely erroneous, containing
not even one correct significant digit.

1. INTRODUCTION. Since the time when the electronic computer began
to supplant the desk calculator as the chief tool for solving linear least
squares problems, numerous least squares computer programs have been written.
These programs have utilized a variety of computational algorithms.
Because least squares problems are by their very nature frequently
ill-conditioned, the numerical accuracy achieved by a least squares program
strongly depends upon the choice of the algorithm. Many programs have
been written which use methods appropriate for desk calculators but
inappropriate for computers. Anscombe [1] has aptly remarked: "Textbooks
of statistical method display a wonderful unanimity in recommending
computational procedures that are suited to desk calculators but are perilous
for computers. Only with some determination can the statistician break
himself of bad habits and become adequately informed about round-off error."

The present study was undertaken to assess the numerical accuracy
of representative least squares programs from a variety of sources. Two
test problems, both fifth degree polynomials, have been run on more than
twenty different programs. Included in the study were programs from the
BMD Biomedical Computer Programs collection [14], the C-E-I-R Multi-Access
Computing Services library [10], the IBM SHARE library [23], the IBM System/360
Scientific Subroutine Package [22], the Univac MATH-PACK [33] and
STAT-PACK [34] collections, and the Project MAC 7094 disk files [28]. A listing
of the sources of the programs is given in Appendix A, together with a brief
description of each program.
[...] were run in double precision as well as in single precision. This,
of course, necessitated certain changes in the original programs.

The programs included in this study used essentially four different
algorithms: orthogonal Householder transformations; Gram-Schmidt
orthonormalization; orthogonal polynomials; and Gaussian or Jordan elimination.


The linear least squares problem may be briefly stated as follows:
One has n observations or measurements of a "dependent" variable y, which
are statistically independent with common variance $\sigma^2$, whose expected
values are given by a linear function of the corresponding values of k
"independent" variables, $x_1, x_2, \ldots, x_k$, $k \le n$. In matrix notation we
say that the n observations have expected values $E(Y) = X\beta$, where Y is an
n x 1 vector, X is an n x k matrix, and $\beta$ is a k x 1 vector of unknown
coefficients. Assuming that X is of rank k, the least squares estimates
of the coefficients are given by $\hat{\beta} = (X'X)^{-1}X'Y$. Other quantities of
interest are $\hat{Y} = X\hat{\beta}$, the vector of predicted values; $\delta = Y - \hat{Y}$, the
vector of residuals; and $s^2 = \delta'\delta/(n-k)$, an estimate of the variance $\sigma^2$.
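
As a small illustration of the quantities just defined, the sketch below computes the coefficient estimates by Gram-Schmidt orthonormalization (in its modified form) rather than by forming $X'X$ explicitly, and then the residual variance $s^2$. It is written in Python for this illustration only; it is not one of the programs tested in this study, the function names are hypothetical, and the data are a made-up quadratic, not the fifth degree test problems.

```python
def mgs_least_squares(X, y):
    """Least squares coefficients via modified Gram-Schmidt (X of full rank k)."""
    n, k = len(X), len(X[0])
    cols = [[X[r][c] for r in range(n)] for c in range(k)]   # columns of X
    R = [[0.0] * k for _ in range(k)]
    for j in range(k):
        R[j][j] = sum(v * v for v in cols[j]) ** 0.5
        cols[j] = [v / R[j][j] for v in cols[j]]             # orthonormal column
        for m in range(j + 1, k):
            R[j][m] = sum(u * v for u, v in zip(cols[j], cols[m]))
            cols[m] = [v - R[j][m] * u for u, v in zip(cols[j], cols[m])]
    z = [sum(u * v for u, v in zip(cols[j], y)) for j in range(k)]   # Q'y
    beta = [0.0] * k
    for j in reversed(range(k)):                              # solve R beta = Q'y
        tail = sum(R[j][m] * beta[m] for m in range(j + 1, k))
        beta[j] = (z[j] - tail) / R[j][j]
    return beta


def residual_variance(X, y, beta):
    """s^2 = (sum of squared residuals) / (n - k)."""
    n, k = len(X), len(beta)
    resid = [yi - sum(x * b for x, b in zip(row, beta)) for row, yi in zip(X, y)]
    return sum(e * e for e in resid) / (n - k)


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    X = [[1.0, x, x * x] for x in xs]
    y = [1.0 + 2.0 * x + 3.0 * x * x for x in xs]   # exact quadratic, no noise
    beta = mgs_least_squares(X, y)
    print(beta, residual_variance(X, y, beta))
```

Working with the columns of X directly avoids forming $X'X$; the abstract reports that programs based on orthogonalization were much more accurate than those using elimination algorithms.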


In  running  certain  programs,  modifications  were  occasionally  made  to 
input  and  output  formats.  Other  changes  were  made  in  five  of  the  programs 
using  elimination  algorithms  because  the  original  versions  of  these  programs 
failed  to  give  solutions  to  the  fifth  degree  polynomial  problems.  The 
nature  of  these  changes  will  be  described  in  the  discussion  of  the  individual 
programs  in  Section  7. 


Three computers were used: the GE 235, the IBM 7094, and the Univac
1108. The 1108 which was used is located at the National Bureau of Standards,
and the 7094 which was chiefly used is located at Harry Diamond Laboratories,
Washington, D. C. The programs run on the 235 and the Project MAC 7094
utilized consoles at the National Bureau of Standards connected to computers
at other locations.

Previous  studies  appraising  linear  least  squares  programs  and  comparing 
the  results  of  different  algorithms  have  been  made  by  Cameron  [9],  Freund  [18], 
Bright  and  Dawkins  [7],  Zellner  and  Thornber  [38],  Longley  [25],  and  Jordan 
[24].  The  present  study  differs  from  the  earlier  ones  mainly  by  including  a 
larger  selection  of  widely  used  and  easily  accessible  programs. 

A more detailed report of the present study is given in Wampler [36].
The more detailed version contains an appendix giving the individual
coefficients obtained in running each program, an investigation into the effect
of rounded input on the solution of a least squares problem, additional
details pertaining to certain programs, and results from some additional test
problems. The longer report also includes several programs designed not
specifically for solving least squares problems but for solving n equations
in n unknowns, thus forcing one to use X'X and X'Y as input. Since it is
well known that this is not, in general, a good method for solving least
squares problems, these programs are omitted from the present summary
report. (There was one outstanding exception among the programs requiring
X'X and X'Y as input. This was Newman's program, described in [30], which
requires integer input and uses integer arithmetic and congruential methods
to obtain exact solutions.) The present report gives results of one
program (BJORCK-GOLUB) not included in the more detailed report.

It was outside the scope of the present study to make a detailed
comparison of algorithms with respect to efficiency of computation time
and  storage  requirements.  The  programs  which  were  included  in  this  study 
exhibited  considerable  variation  in  what  quantities  were  calculated  as 
well  as  in  the  methods  of  calculation,  and  output  ranged  from  meager  to 
copious.  Moreover,  no  comparative  examination  of  the  outputs  provided 
by  the  programs  was  made.  Rather,  this  investigation  focused  attention 
on  the  performance  of  existing  programs. 

2. THE TEST PROBLEMS. The two test problems which were used
throughout this investigation are identified as Y1 and Y2. Both were
fifth degree polynomials, with the values of x being the integers 0, 1,
2, ..., 20. The "observations," Y1 and Y2, were calculated from the
following equations:

Y1: $y = 1 + x + x^2 + x^3 + x^4 + x^5$, x = 0(1)20

Y2: $y = 1 + .1x + .01x^2 + .001x^3 + .0001x^4 + .00001x^5$, x = 0(1)20

Thus, the values of Y1 were integers having from one to seven digits, and
those of Y2 were five-decimal numbers ranging from 1.00000 to 63.00000.
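
The two sets of "observations" are easy to regenerate. The following sketch
uses exact rational arithmetic for Y2 so that the five-decimal values carry no
rounding error; the variable names are illustrative assumptions.

```python
from fractions import Fraction

xs = list(range(21))  # x = 0(1)20

# Y1: y = 1 + x + x^2 + x^3 + x^4 + x^5
y1 = [sum(x ** p for p in range(6)) for x in xs]

# Y2: y = 1 + .1x + .01x^2 + ... + .00001x^5, i.e. the sum of (x/10)^p
y2 = [sum(Fraction(x, 10) ** p for p in range(6)) for x in xs]

for x, a, b in zip(xs, y1, y2):
    print(f"{x:2d} {a:8d} {float(b):9.5f}")
```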

If the least squares solutions were computed with no rounding error,
one would obtain

\[ \hat\beta(Y1) = (1, 1, 1, 1, 1, 1)', \qquad \hat\beta(Y2) = (1, .1, .01, .001, .0001, .00001)', \]

and for both problems the residual standard deviation would be zero.

For some programs the input required was the 21 values of x and y.
Other programs required, in addition, the powers $x^2$, $x^3$, $x^4$, and $x^5$ to be
entered as input. The input is listed in Table 5, along with the matrices
X'X and X'Y associated with the test problems.
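
As a check on the statements above, the normal equations for both test
problems can be solved with no rounding error at all. The sketch below is an
illustration under stated assumptions: it uses Python's exact `Fraction`
arithmetic, and the helper name `solve_exact` is hypothetical. It is not the
integer and congruential method of Newman's program, only a simple way to see
what a rounding-free solution looks like.

```python
from fractions import Fraction


def solve_exact(A, b):
    """Gauss-Jordan elimination over the rationals (no rounding error)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * t for a, t in zip(M[r], M[c])]
    return [row[n] for row in M]


# Rows of X are (1, x, x^2, ..., x^5) for x = 0(1)20.
X = [[x ** p for p in range(6)] for x in range(21)]
Y1 = [sum(row) for row in X]
Y2 = [sum(Fraction(v, 10 ** p) for p, v in enumerate(row)) for row in X]

XtX = [[sum(r[a] * r[b] for r in X) for b in range(6)] for a in range(6)]


def normal_equation_solution(Y):
    XtY = [sum(r[a] * y for r, y in zip(X, Y)) for a in range(6)]
    return solve_exact(XtX, XtY)


b1 = normal_equation_solution(Y1)
b2 = normal_equation_solution(Y2)
print(XtX[0])
print(b1)
print([float(v) for v in b2])
```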


The two test problems, Y1 and Y2, were chosen because they are so
highly ill-conditioned that some programs fail to obtain correct solutions
while other programs succeed in obtaining reasonably accurate solutions.
Polynomial problems were chosen because polynomial fitting is an important
type of linear least squares problem which occurs frequently in practice.

The ill-conditioning of the two test problems can be described more
explicitly. One measure of the condition of a matrix A is the P-condition,
defined as

\[ P(A) = \frac{|\lambda|}{|\mu|}, \]

where $\lambda$ is the numerically largest eigenvalue of A and $\mu$ is the numerically
smallest eigenvalue of A. (See Newman [29, p. 240].)
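
A rough numerical estimate of the P-condition of X'X for the test problems can
be obtained as follows. This is only a sketch under stated assumptions: the
inverse is formed in exact rational arithmetic so that the smallest eigenvalue
is not lost to rounding, and the extreme eigenvalues are then estimated by
simple power iteration. The helper names are hypothetical, and none of this is
taken from the programs in the study.

```python
from fractions import Fraction
import math


def inverse_exact(A):
    """Exact inverse by Gauss-Jordan elimination on [A | I]."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * t for a, t in zip(M[r], M[c])]
    return [row[n:] for row in M]


def dominant_eigenvalue(A, iters=200):
    """Power iteration with a Rayleigh quotient, for a symmetric matrix A."""
    n = len(A)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(wi * vi for wi, vi in zip(w, v)) / sum(vi * vi for vi in v)
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    return lam


X = [[x ** p for p in range(6)] for x in range(21)]
A = [[sum(r[a] * r[b] for r in X) for b in range(6)] for a in range(6)]

lam = dominant_eigenvalue([[float(v) for v in row] for row in A])
mu = 1.0 / dominant_eigenvalue([[float(v) for v in row] for row in inverse_exact(A)])
P = lam / mu
print(f"estimated P-condition of X'X: {P:.3e}")
```

The estimate should be of the same order of magnitude as the P-condition
reported for X'X in this section.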

For A = X'X, the 6 x 6 matrix associated with Y1 and Y2, the
P-condition is $4.095 \times 10^{13}$. In this respect, it is similar to the Hilbert matrix
of order 10, whose P-condition is $1.603 \times 10^{13}$ (see Fettis and Caswell [16]).
The P-condition of the Hilbert matrix of order 11 is $5.231 \times 10^{14}$. The
relation between the Hilbert matrix and the matrix X'X which arises in a
polynomial fit is discussed in Forsythe [17].


Most of the programs which were tested obtained more accurate solutions
for Y2 than for Y1. If we let A denote the 7 x 7 matrix

\[ A = \begin{pmatrix} X'X & X'Y \\ Y'X & 0 \end{pmatrix}, \]

we find that for Y2, $P(A) = 4.095 \times 10^{13}$, whereas for Y1, $P(A) = 6.829 \times 10^{13}$,
indicating that the system involving Y1 is more ill-conditioned than that
involving Y2.


The test problem used by Longley [25] was also highly ill-conditioned.
For the 7 x 7 matrix X'X of his problem, the P-condition is $2.361 \times 10^{19}$.


3. SUMMARY OF THE RESULTS. Tables 1 to 4 present a brief summary
of the main results. A count, $C_j$, of the number of correct significant
digits in each computed coefficient was obtained as follows:

Let $\beta_j$ (j = 1, 2, ..., 6) denote the "true" value of the coefficient;
that is, the value computed with no rounding error. Let $\hat\beta_j$ denote the
value calculated by the computer. Then

\[ C_j = -\log_{10} \left| \frac{\hat\beta_j - \beta_j}{\beta_j} \right| . \]

The above approach to counting the number of correct digits in a
computed value has been used by Jordan [24] and others.

Tables 1 to 4, in the columns headed "Average Number of Correct
Digits," report $\bar{C} = \frac{1}{6} \sum_{j=1}^{6} C_j$.

From the above definition, a negative count can occur. For example,
if $\beta_j = 1.0$ and $\hat\beta_j = 136.0$, we get $C_j = -2.130$. This indicates that
$\hat\beta_j$ is wrong by roughly two orders of magnitude.
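
The worked example above (a true value of 1.0 and a computed value of 136.0
giving a count of -2.130) is consistent with a count equal to minus the
base-10 logarithm of the relative error. The sketch below assumes that form.
The `cap` argument is a hypothetical device for the case of exact agreement,
where the logarithm is undefined; how the study scored that case is not stated
here.

```python
import math


def correct_digits(true_value, computed, cap=8.0):
    """Count of correct significant digits, assumed to be -log10(relative error).

    `cap` (an assumption of this sketch) is returned when the computed value
    agrees exactly with the true value.
    """
    if true_value == 0:
        raise ValueError("relative error is undefined for a zero true value")
    rel_err = abs(computed - true_value) / abs(true_value)
    if rel_err == 0.0:
        return cap
    return -math.log10(rel_err)


def average_count(true_values, computed_values, cap=8.0):
    """Average of the individual counts over all coefficients."""
    counts = [correct_digits(t, c, cap) for t, c in zip(true_values, computed_values)]
    return sum(counts) / len(counts)


example = correct_digits(1.0, 136.0)
print(round(example, 3))
```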

For two programs reported in Table 1, BMD03R run on the 7094 and
DAM run on the 7094, the count for several coefficients was made in a
different manner. The BMD03R program printed the coefficients in a
fixed-decimal format, with five decimals. The DAM program used a
floating-point format with only three decimals printed. A coefficient
printed as .00010, when the true coefficient was .0001, was given a
count of 2, and 0.100E01, when the true coefficient was 1., was given
a count of 3. In such cases the assigned count may have been too small,
since the coefficients may have been calculated accurately to more digits
than were printed. In running these two programs on the 1108, the output
format was changed so that eight significant digits were printed.

Each of the tables (1 through 4) summarizes a set of results for a
particular machine precision. Within each table the various programs are
given a numerical rank for each of the two test problems, with rank 1
denoting the best performance according to the count C.

4. PROGRAMS USING ORTHOGONAL HOUSEHOLDER TRANSFORMATIONS. LSTSQ is
a program written by Peter A. Businger using orthogonal Householder
transformations. This algorithm is described by Golub [19], and Businger and
Golub [8]. The program applies a sequence of orthogonal transformations
to the n x k least squares matrix X to obtain a decomposition X = QR,
where R is upper triangular and $Q'Q = I_k$. A pivoting strategy is used
so that at each step the column with the largest sum of squares is
reduced next. Once an initial solution is obtained, the program iterates
to obtain a (possibly) improved solution.
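
The core of the Householder approach can be sketched in a few lines. The code
below is a minimal illustration, not a reconstruction of LSTSQ: it omits the
column pivoting and the iterative refinement described above, works in
ordinary double precision, and the function name `householder_lstsq` is an
assumption of this example.

```python
import math


def householder_lstsq(X, y):
    """Least squares via Householder reflections (no pivoting, no refinement)."""
    n, k = len(X), len(X[0])
    # Work on the augmented array [X | y] so that y is transformed along with X.
    A = [[float(v) for v in row] + [float(yi)] for row, yi in zip(X, y)]
    for j in range(k):
        norm = math.sqrt(sum(A[i][j] ** 2 for i in range(j, n)))
        if norm == 0.0:
            raise ValueError("X is rank deficient")
        alpha = -math.copysign(norm, A[j][j])  # sign chosen to avoid cancellation
        v = [A[i][j] for i in range(j, n)]
        v[0] -= alpha
        vtv = sum(t * t for t in v)
        # Apply H = I - 2 v v' / (v'v) to columns j..k (column k holds y).
        for c in range(j, k + 1):
            s = 2.0 * sum(v[i - j] * A[i][c] for i in range(j, n)) / vtv
            for i in range(j, n):
                A[i][c] -= s * v[i - j]
    # Back substitution on the triangular system R b = (Q'y)[0:k].
    b = [0.0] * k
    for i in reversed(range(k)):
        s = A[i][k] - sum(A[i][c] * b[c] for c in range(i + 1, k))
        b[i] = s / A[i][i]
    return b


# Hypothetical, well-conditioned example: y = 2 + 3x exactly.
X = [[1.0, float(x)] for x in range(5)]
y = [2.0 + 3.0 * x for x in range(5)]
b = householder_lstsq(X, y)
print(b)
```

The matrix X'X is never formed, which is the main reason methods of this kind
tolerate ill-conditioning better than those based on the normal equations.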

The BJORCK-GOLUB program uses the Householder transformation
algorithm described by Bjorck and Golub [6]. This algorithm takes
advantage of the fact that $X'\delta = 0$, where $\delta$ is the vector of residuals,
to obtain the solution $\hat\beta$ in $X\beta = Y$ from the augmented system of n + k
equations:

\[ \begin{pmatrix} I & X \\ X' & 0 \end{pmatrix} \begin{pmatrix} \delta \\ \beta \end{pmatrix} = \begin{pmatrix} Y \\ 0 \end{pmatrix} . \]

Here $\delta$ as well as $\hat\beta$ is included in the iterative refinement procedure.
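
That an augmented system of n + k equations yields the residuals and the
coefficients together can be verified on a small case. The sketch below
assumes the block form with an identity block and X in the first n rows and
X' with a zero block in the last k rows; it solves that system exactly with
rational arithmetic. The data and the helper name `solve_exact` are
hypothetical, and no iterative refinement is attempted.

```python
from fractions import Fraction


def solve_exact(A, b):
    """Gauss-Jordan elimination over the rationals."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * t for a, t in zip(M[r], M[c])]
    return [row[n] for row in M]


X = [[1, 0], [1, 1], [1, 2], [1, 3]]
Y = [1, 3, 4, 8]
n, k = len(X), len(X[0])

# First n rows: delta + X beta = Y.  Last k rows: X' delta = 0.
top = [[int(i == j) for j in range(n)] + X[i] for i in range(n)]
bottom = [[X[i][a] for i in range(n)] + [0] * k for a in range(k)]
z = solve_exact(top + bottom, Y + [0] * k)
delta, beta = z[:n], z[n:]
print(delta, beta)
```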

Of all the programs included in this study, LSTSQ and BJORCK-GOLUB
appear to have given the best performance. In Table 3, which reports
the performance of eleven double precision programs, we see that LSTSQ
ranked first for Y1 and second for Y2, and that BJORCK-GOLUB ranked
first for Y2 and second for Y1. In Table 1, which reports the performance
of 20 single precision programs, we see that LSTSQ ranked first for Y1 and
fourth for Y2, and that BJORCK-GOLUB ranked second for Y1 and third for
Y2. Ranks 1 and 2 for the Y2 problem were obtained by ORTHOL and OMNITAB
(using ORTHO), two programs using Gram-Schmidt orthonormalization which
will be discussed in the next section. Table 4 reports the performance of
four programs which used single precision arithmetic except for the
accumulation of inner products, where double precision arithmetic was used.
Here we see that LSTSQ and BJORCK-GOLUB tied to obtain the top rank for
Y1 (having perfect scores of 8.000), but ranked third and fourth,
respectively, for Y2. In Table 4, we note that all four programs obtained
similar scores for the Y2 problem, with rank 1 corresponding to 6.530
and rank 4 to 6.227. In the Businger-Golub and Bjorck-Golub algorithms,
it is recommended that all inner products be accumulated in double
precision. By comparing Tables 4 and 1 we see that when LSTSQ included this
feature, the average counts increased from 4.528 to 8.000 for Y1 and from
5.840 to 6.279 for Y2. With all operations performed in double precision
(see Table 3), the counts increased to 14.643 and 16.293, respectively.
The BJORCK-GOLUB program displayed similar improvements in accuracy when
inner products were accumulated in double precision and when all
operations were carried out in double precision.

Another  program  using  Householder  transformations  was  ALSQ,  a 
program  containing  no  pivoting  and  no  iteration.  In  Tables  1,  3,  and  4 
we  see  that  ALSQ  performed  not  quite  as  well  as  the  LSTSQ  and  BJORCK- 
GOLUB  programs  which  included  these  features,  except  in  one  instance.  In 
this  one  instance,  Y2  in  Table  4,  we  note  that  its  performance  was  slightly 
better than that of the other programs in this category.

5. PROGRAMS USING GRAM-SCHMIDT ORTHONORMALIZATION. ORTHO is a
program written by Philip J. Walsh using a Gram-Schmidt orthonormalization
process. This algorithm is described by Davis and Rabinowitz [13], Davis
[12], and Walsh [35]. ORTHO exists as a FORTRAN program, an ALGOL
procedure, a BASIC program, and as a routine of the OMNITAB program [21].

Starting with the n x k matrix X, the Gram-Schmidt process of ORTHO
obtains $\Phi = XT^{-1}$ and $\hat\beta = T^{-1}\Phi'Y$, where $T^{-1}$ is upper triangular and
$\Phi'\Phi = I_k$. This algorithm includes a feature of reorthonormalizing the
vectors of $\Phi$, proceeding from a first approximation of each vector to a
(usually) better approximation $\phi_j$. From Table 1 it is clear that this
reorthonormalizing is vital to the algorithm, for ORTHO's good performance
in handling Y1 and Y2 deteriorated when this iteration was omitted. For Y1,
the count of correct digits dropped from 4.137 to -1.976, and for Y2 the drop
was from 5.464 to 0.419. In Table 3, also, we see that in double precision the
omission of the iteration resulted in a loss of about five correct digits
for both problems.

Of  the  six  programs  in  Table  2,  LSFITW***,  written  in  BASIC,  ranked 
first  on  both  problems.  We  note  that  Table  2  includes  no  Householder 
transformation  programs. 

The  ORTHO  program  was  also  run  in  a  version  using  single  precision 
except  for  the  accumulation  of  inner  products,  where  double  precision 
was  used.  In  Table  4  we  see  that  there  were  four  programs  in  this  category, 
and ORTHO ranked third for Y1 and second for Y2.

ORTHOL is a program using a modification of the Davis-Rabinowitz
algorithm. It differs from ORTHO in two respects: (1) the iteration
procedure includes the dependent variable as well as the independent
variables; and (2) before any other operations are applied to the
matrix X, from each element of each vector of X, the truncated mean of
that vector is subtracted. (The "truncated mean" denotes the largest
integer less than or equal to the mean if the mean is nonnegative, and
the smallest integer greater than or equal to the mean if the mean is
negative.) ORTHOL obtained the top rank for Y2 in single precision, but
ranked sixth for Y1 (Table 1). In double precision (Table 3), it ranked
third on both problems.
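
The "truncated mean" defined above is simply the mean rounded toward zero. A
minimal sketch follows; the function names are illustrative assumptions and
the code is not taken from ORTHOL.

```python
import math


def truncated_mean(values):
    """Mean rounded toward zero: floor if nonnegative, ceiling if negative."""
    values = list(values)
    return math.trunc(sum(values) / len(values))


def subtract_truncated_mean(column):
    """Center one vector of X by its truncated mean."""
    t = truncated_mean(column)
    return [v - t for v in column]


x_squared = [x ** 2 for x in range(21)]   # mean is 2870/21, about 136.67
print(truncated_mean(x_squared))
print(truncated_mean([-1, -2]))           # mean is -1.5
print(subtract_truncated_mean([1, 2, 4]))
```

Subtracting an integer from integer-valued input introduces no rounding error
of its own, which may be one reason for preferring the truncated mean to the
mean itself; that motivation is a conjecture here, not a statement from the
program's documentation.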

6.  PROGRAMS  USING  ORTHOGONAL  POLYNOMIALS.  Since  the  two  test  problems 
are  both  polynomial  fits,  we  were  able  to  test  programs  in  which  the 
algorithm  used  orthogonal  polynomials.  This  method,  described  by  Forsythe 
[17],  is  attractive  because  it  generally  requires  many  fewer  operations 
than  other  methods. 

Two such programs were included in this study. One was the UNIVAC
1108  MATH-PACK  ORTHLS  routine  [33].  The  other  was  POLFIT,  an  anonymous 
program  written  in  BASIC. 




In Tables 1, 2, and 3, we see that the performance of the orthogonal
polynomial programs is not as good as that of the Householder
transformation and the Gram-Schmidt programs (with iteration), but the
performance is better than that of any of the programs using elimination
algorithms.

7.  PROGRAMS  USING  ELIMINATION  ALGORITHMS.  The  majority  of  the 
programs  tested  in  this  investigation  used  some  form  of  an  elimination 
algorithm.  Although  this  was  the  most  popular  method,  it  was  the  least 
successful.  None  of  these  programs  performed  as  well  as  those  using 
Householder's  transformations,  Gram-Schmidt  orthonormalization  (with 
iteration),  or  orthogonal  polynomials. 

Within this class of programs, there were several variations in
the method of obtaining the least-squares coefficients. In some cases,
the matrix X'X was inverted, after which the inverse was postmultiplied
by X'Y. One program inverted the matrix Z'Z where the vectors of Z were
obtained from the vectors of X by subtracting the mean of each vector from
every element of that vector. A number of programs obtained the solution
by inverting a matrix of correlation coefficients. The five stepwise
regression programs made use of matrix partitioning in connection with
inverting a matrix of correlation coefficients.

The five stepwise regression programs were BMD02R, MPR3, the STAT-PACK
program RESTEM, WRAP, and STAT20***. They all, to a greater or
lesser  extent,  follow  Efroymson's  algorithm  [15].  Tables  1,  2,  and  3 
give  the  results  of  these  five  programs. 

In  running  the  two  test  problems  on  three  of  the  stepwise  programs, 
namely,  BMD02R,  RESTEM  and  STAT20***,  calculations  stopped  before  the 
solutions  were  obtained.  These  programs  at  various  steps  calculate  an 
F-level  in  connection  with  entering  or  removing  variables,  and  a  point 
was  reached  where  this  F-level  was  calculated  to  be  negative  because  of 
rounding  error.  Since  this  condition  caused  the  calculations  to  stop, 
certain  steps  of  the  algorithm  had  to  be  bypassed  to  obtain  the  final 
solution.  These  steps  were  not,  however,  connected  with  the  calculation 
of  the  least  squares  coefficients. 

WRAP,  the  program  with  the  lowest  rankings  in  Table  1,  computed 
coefficients  which  were  exceptionally  far  from  the  true  values.  These 
coefficients are listed below.

Two other BMD programs, in addition to BMD02R mentioned earlier, were
tested.  These  were  BMD03R,  Multiple  Regression  with  Case  Combinations, 
which  inverts  a  matrix  of  correlation  coefficients,  and  BMD05R,  Polynomial 
Regression,  which  inverts  the  matrix  Z'Z  where  the  vectors  of  Z  are  formed 
from  the  vectors  of  X  by  subtracting  the  mean  of  each  vector  from  every 
element  of  that  vector.  All  the  crucial  operations  of  BMD05R,  such  as 
the  forming  of  inner  products  and  matrix  inversion,  are  carried  out  in 
double  precision.  The  performance  of  BMD03R  and  BMD05R  is  shown  in 
Tables  1  and  3,  respectively. 

DAM  is  a  general-purpose  computer  program  for  data  processing  and 
multiple regression [31]. In running the two test problems on DAM on
the  1108,  computations  stopped  after  a  fourth  degree  polynomial  was 
fitted.  It  was  found  that  a  computed  variance  was  zero  and  that  this 
condition  causes  the  computations  to  stop.  By  bypassing  the  checks  on 
this  computed  variance,  results  for  fifth  degree  fits  were  obtained. 

On  the  7094,  however,  the  fifth  degree  results  were  reached  without  any 
such  difficulties.  DAM's  performance  on  the  two  computers  is  given  in 
Table  1. 

The  program  POLRG  is  the  polynomial  regression  program  of  the  IBM 
System/360 Scientific Subroutine Package [22]. We see from Table 1
that  the  single  precision  version  of  POLRG  obtained  rather  low  scores 
on  both  test  problems.  A  double  precision  version  of  POLRG  was  also 
run,  and  the  performance  here  as  reported  in  Table  3  was  comparable  to 
other  programs  using  similar  elimination  algorithms. 

The  user  of  POLRG  specifies  m,  the  highest  degree  polynomial  to  be 
fitted,  and  the  program  automatically  reports  the  results  of  fitting 
polynomials  of  successively  increasing  degrees,  starting  with  the  first 
degree.  If  there  is  no  reduction  in  the  residual  sum  of  squares  between 
two  successive  degrees  of  polynomials,  the  program  stops  the  problem 
before  completing  the  analysis  for  the  highest  degree  specified.  In 
running  both  test  problems  in  single  precision  the  analysis  stopped 
after  degree  four,  and  in  lieu  of  a  fifth  degree  polynomial  fit,  the 
message "NO IMPROVEMENT" was printed. In order to complete the
calculations for the fifth degree, the checks on "improvement" were bypassed.

In  the  double  precision  version,  fifth  degree  results  were  obtained 
without any such alterations.

Each of the two STAT-PACK programs, GLH, General Linear Hypotheses,
and REBSOM, Back Solution Multiple Regression, has its individual features,
but for the two test problems the solutions were carried out in the same
manner, so that the coefficients obtained from the two programs were
identical, as is indicated in Table 1. Both programs invert X'X by calling
the same matrix inversion subroutine which uses a Gauss-Jordan elimination
scheme with maximal column pivoting and row scaling.

The BASIC program LINFIT***, in order to obtain $\hat\beta$, inverts the matrix

\[ A = \begin{pmatrix} X'X & X'Y \\ Y'X & Y'Y \end{pmatrix}, \]

whose inverse, if it exists, is

\[ A^{-1} = \begin{pmatrix} (X'X)^{-1} + \hat\beta\hat\beta'/\delta'\delta & -\hat\beta/\delta'\delta \\ -\hat\beta'/\delta'\delta & 1/\delta'\delta \end{pmatrix} . \]

When $Y = \hat{Y}$, the matrix A is singular. In the two test problems $Y = \hat{Y}$,
so that the matrix A, if it were formed in the computer without any
rounding error, would be singular. But A, for Y1 and Y2, contains
14-digit numbers, whereas the 235 computer works with approximately
nine-digit numbers, so that rounding of the elements of A is inevitable, and
the version of A contained in the computer is not singular. An "inverse"
was obtained, and from this $\hat\beta$ was immediately computed. Table 2 gives
the results.

LSCF--*** and STAT21*** are two BASIC programs available in the
C-E-I-R Multi-Access Computer Service; results are given in Table 2.
LSCF--***, which obtains the coefficients by inverting X'X and then
post-multiplying the inverse by X'Y, had the lowest rankings of Table 2.
STAT21*** obtains $(X'X)^{-1}$ and $\hat\beta$ by applying Jordan elimination to X'X
and X'Y.


The LINFIT program included in Table 1 is one of eighteen statistical
routines described by Miller [28] which exist in the Project MAC* 7094 disk
files. The two test problems were run on the LINFIT program on a
time-shared computer via a console connected with Project MAC.
The method used by the LINFIT program is not given. By conjecture, it
has been included in this section among programs using elimination
algorithms.

*A description of Project MAC is given in Crisman [11].

8. OTHER RECENT ALGORITHMS. Some other algorithms apparently of
high  quality  which  have  been  published  in  the  last  few  years  were  not 
included  in  this  study.  Two  such  algorithms  are  given  by  Bauer  [2]  and 
Bjorck  [5]. 

Bauer [2] gives an ALGOL procedure using iterative refinement for
finding the least squares solution of $X\beta = Y$, where X is n x k ($k \le n$)
of rank k and Y is n x p. The procedure is based on the decomposition
of X into UDR where U is n x k with orthogonal columns, $D = (U'U)^{-1}$, and
R is upper triangular. This decomposition yields a triangular system
$R\beta = U'Y$ which is solved by back substitution. The reduction to
$R\beta = U'Y$ is carried out by a Gaussian elimination scheme, but with a
suitably weighted combination of rows used for elimination instead of
a single row.

Bjorck's algorithm [5] (see also Bjorck [3], [4]), using a modified
Gram-Schmidt orthogonalization process, has certain features in common
with the Bjorck-Golub algorithm discussed in Section 4 above. Two such
features are solving the system of n + k equations

\[ \begin{pmatrix} I & X \\ X' & 0 \end{pmatrix} \begin{pmatrix} \delta \\ \beta \end{pmatrix} = \begin{pmatrix} Y \\ 0 \end{pmatrix} \]

to obtain $\hat\beta$ and $\delta$, and inclusion of $\delta$ as well as $\hat\beta$ in the iterative
refinement procedure.

Both the classical Gram-Schmidt orthogonalization process and the
modified Gram-Schmidt orthogonalization process, as described by Bjorck
[3], decompose the matrix X into QR where Q'Q is diagonal and R is upper
triangular. In the classical procedure, at the i-th stage, the i-th
column vector is made orthogonal to each of the i - 1 previously
orthogonalized column vectors; this is done for column indices i = 2, 3, ..., k.
In the modified procedure which Bjorck uses, at the i-th stage, the
(k - i + 1) column vectors indexed i, i + 1, ..., k are made orthogonal
to the (i - 1)-th column vector; this is done for column indices i = 2, 3,
..., k. Jordan [24] shows why the modified procedure is superior to the
classical procedure. Bjorck [3] states that his modified Gram-Schmidt
procedure is equivalent to Bauer's method using weighted row combinations
mentioned above. Bjorck's algorithm is generalized to handle the case
where X is of less than full rank; here, linear constraints are entered.
He also discusses the storage requirements of his algorithm, and he
compares the number of operations needed with the corresponding number
needed in the Bjorck-Golub algorithm [6].
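
The difference between the classical and the modified procedure described
above can be seen in a short experiment. The sketch below follows the two
descriptions literally, leaving the columns unnormalized so that Q'Q is
diagonal. The 4 x 3 test matrix with a small parameter `eps` is a standard
textbook-style illustration chosen for this example; it is an assumption and
is not one of the test problems of this study.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def classical_gs(cols):
    """Classical: each column is orthogonalized against all earlier q's,
    with every coefficient computed from the ORIGINAL column."""
    Q = []
    for x in cols:
        coeffs = [dot(q, x) / dot(q, q) for q in Q]
        v = list(x)
        for c, q in zip(coeffs, Q):
            v = [vi - c * qi for vi, qi in zip(v, q)]
        Q.append(v)
    return Q


def modified_gs(cols):
    """Modified: at each stage all remaining columns are made orthogonal
    to the column just completed, using their CURRENT (updated) values."""
    V = [list(c) for c in cols]
    for i in range(len(V)):
        q = V[i]
        for j in range(i + 1, len(V)):
            c = dot(q, V[j]) / dot(q, q)
            V[j] = [vj - c * qi for vj, qi in zip(V[j], q)]
    return V


def worst_cosine(Q):
    """Largest |cosine| between two distinct columns (0 means orthogonal)."""
    return max(abs(dot(Q[i], Q[j])) / math.sqrt(dot(Q[i], Q[i]) * dot(Q[j], Q[j]))
               for i in range(len(Q)) for j in range(i))


eps = 1e-8  # small enough that 1 + eps**2 rounds to 1 in double precision
cols = [[1.0, eps, 0.0, 0.0], [1.0, 0.0, eps, 0.0], [1.0, 0.0, 0.0, eps]]
cgs_err = worst_cosine(classical_gs(cols))
mgs_err = worst_cosine(modified_gs(cols))
print(cgs_err, mgs_err)
```

In this example the classical procedure leaves two of the computed vectors
far from orthogonal, while the modified procedure keeps all pairs orthogonal
to roughly the size of `eps`.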


9. CONCLUSIONS.

(1)  Computational  procedures  appropriate  for  desk  calculators 
may  be  perilous  for  computers. 

(2) Of the four procedures which were included in this study,
orthogonal Householder transformations and Gram-Schmidt
orthonormalization proved to be the best. Orthogonal polynomials ranked
next. Elimination methods were the least successful but the most popular.

(3) Programmers who have been writing least squares programs,
especially for statistical packages, have often not been taking advantage
of the advances in this area made by numerical analysts in recent years.

(4) The importance of accumulating inner products in double precision
cannot be overstressed. A number of recent papers on least squares
computations have emphasized this point. These include Businger and Golub [8],
Bauer [2], Golub and Wilkinson [20], Bjorck and Golub [6], and Bjorck [5].
On many third-generation computers which have double precision built into
the hardware, double precision arithmetic is quite efficient.

(5) Iterative refinement is another valuable feature of recent
algorithms. Five programs included in the present study (BJORCK-GOLUB,
LSFITW***, LSTSQ, ORTHO and ORTHOL) made effective use of iterative
refinement, and the two algorithms described in Section 8 both include
this feature. Golub and Wilkinson [20] give a discussion of this topic.

(6) The users of least squares programs can take certain
precautionary steps to gain an awareness of whether or not a rounding error
problem exists. Among the suggestions which have been made here are the
following:

(a) Run test problems where the coefficients are known
(Cameron [9]).

(b) Transform the data; e.g., by subtracting means (Freund
[18], Longley [25]).

(c) Do the calculations several times, scaled differently each
time (Zellner and Thornber [38], Longley [25]).

(d) Shuffle the columns of X and run the problem more than
once (Longley [25]).

(e) Check whether $X'\delta = 0$ (Longley [25]).

(f) Use double precision arithmetic (Freund [18]).

(g) Follow the initial fit by a fit to $\hat{Y}$, the predicted values
(suggested by J. M. Cameron; see Wampler [36]).

(7) In any mathematical calculation carried out on a computer, it
is desirable to know whether an accurate solution has been obtained or
whether the result of a calculation is contaminated by rounding error
to such an extent that it is worthless. This goal has been achieved in
some areas. Martin, Peters, and Wilkinson [27], in their paper giving an
algorithm for solving Ax = b, where A is an n x n positive definite matrix,
state that their procedure "either produces the correctly rounded solutions
of the equation Ax = b or indicates that A is too ill-conditioned for this
to be achieved without working to higher precision (or is possibly singular)."
Similarly, Wilkinson's program [37] for the solution of an ill-conditioned
n x n system of equations Ax = b, "gives either a solution of the system
which is correct to working accuracy or alternatively indicates that the
system is too ill-conditioned to be solved without working to higher
precision and may even be singular."

It appears that the goal set out above has now been achieved in the
linear least squares programs of Bjorck and Golub [6] and Bjorck [5]. The
authors state that their procedures may be used to compute accurate solutions
and residuals to linear least squares problems, but that the procedures will
fail when X modified by rounding errors has less than full rank, and that
they will also fail if X is so ill-conditioned that there is no perceptible
improvement in the iterative refinement. The user is easily informed of
these situations.

TABLE 1  SUMMARY OF PROGRAMS RUN IN SINGLE PRECISION - 8 Digits

                                               Average Number of
                                                Correct Digits        Rank
Program                 Computer  Algorithm*      Y1        Y2      Y1    Y2

ALSQ                    1108      HT             4.098     5.368      4     6
BJORCK-GOLUB            1108      HT             4.393     5.950      2     3
BMD02R                  1108      E             -0.106     1.961     13    15
BMD03R                  7094      E              0.742     1.721      9    17
BMD03R                  1108      E             -0.123     2.287     14    13
DAM                     7094      E              1.389     2.312      8    12
DAM                     1108      E             -0.264     2.622     17    10
LINFIT (Miller)         7094      ?             -2.756    -0.301     19    19
LSTSQ                   1108      HT             4.528     5.840      1     4
MATH-PACK, ORTHLS       1108      OP             2.118     4.363      7     7
MPR3                    7094      E             -0.140     1.856     15    16
OMNITAB (Ortho)         7094      GS             3.954     5.968      5     2
OMNITAB (Ortho)         1108      GS             4.137     5.464      3     5
ORTHO (no iteration)    1108      GS            -1.976     0.419     18    18
ORTHOL                  1108      GS             3.593     6.197      6     1
POLRG                   1108      E             -0.191     2.280     16    14
STAT-PACK, GLH          1108      E              0.066     2.767     11*    8*
STAT-PACK, REBSOM       1108      E              0.066     2.767     11*    8*
STAT-PACK, RESTEM       1108      E              0.651     2.407     10    11
WRAP                    7094      E             -5.300    -2.871     20    20

*E = Elimination method; GS = Gram-Schmidt orthonormalization; HT =
Orthogonal Householder transformations; OP = Orthogonal polynomials.

TABLE 2  SUMMARY OF PROGRAMS RUN IN SINGLE PRECISION - 9 Digits

                                               Average Number of
                                                Correct Digits        Rank
Program                 Computer  Algorithm*      Y1        Y2      Y1    Y2

LINFIT***               235       E              0.905     2.694      4     5
LSCF--***               235       E              0.308     2.483      6     6
LSFITW***               235       GS             4.102     6.354      1     1
POLFIT                  235       OP             3.349     5.922      2     2
STAT20***               235       E              0.612     2.920      5     4
STAT21***               235       E              1.169     3.183      3     3

*E = Elimination method; GS = Gram-Schmidt orthonormalization;
OP = Orthogonal polynomials.

TABLE 3  SUMMARY OF PROGRAMS RUN IN DOUBLE PRECISION - 18 Digits

                                               Average Number of
                                                Correct Digits        Rank
Program                 Computer  Algorithm*      Y1        Y2      Y1    Y2

ALSQ                    1108      HT            12.667    15.322      5     5
BJORCK-GOLUB            1108      HT            13.580    17.057      2     1
BMD02R                  1108      E              9.645    12.865      7     7
BMD05R                  1108      E              9.368    11.791      9    10
LSTSQ                   1108      HT            14.643    16.293      1     2
MATH-PACK, ORTHLS       1108      OP            12.098    14.461      6     6
ORTHO                   1108      GS            13.188    15.514      4     4
ORTHO (no iteration)    1108      GS             7.963    10.354     11    11
ORTHOL                  1108      GS            13.212    15.604      3     3
POLRG                   1108      E              9.290    11.806     10     9
STAT-PACK, RESTEM       1108      E              9.494    12.019      8     8

TABLE 4  SUMMARY OF PROGRAMS RUN IN SINGLE PRECISION (8 Digits) WITH
INNER PRODUCTS ACCUMULATED IN DOUBLE PRECISION (18 Digits)

                                      Average Number of
                                       Correct Digits        Rank
Program         Computer  Algorithm*     Y1       Y2       Y1    Y2

ALSQ              1108       HT         3.506    6.530      4     1
BJORCK-GOLUB      1108       HT         8.000    6.227      1*    4
LSTSQ             1108       HT         8.000    6.279      1*    3
ORTHO             1108       GS         3.904    6.459      3     2

*E = Elimination method; GS = Gram-Schmidt orthonormalization; HT = Orthogonal
Householder transformations; OP = Orthogonal polynomials.
TABLE 5

  x          Y1           Y2

  0.          1.       1.00000
  1.          6.       1.11111
  2.         63.       1.24992
  3.        364.       1.42753
  4.       1365.       1.65984
  5.       3906.       1.96875
  6.       9331.       2.38336
  7.      19608.       2.94117
  8.      37449.       3.68928
  9.      66430.       4.68559
 10.     111111.       6.00000
 11.     177156.       7.71561
 12.     271453.       9.92992
 13.     402234.      12.75603
 14.     579195.      16.32384
 15.     813616.      20.78125
 16.    1118481.      26.29536
 17.    1508598.      33.05367
 18.    2000719.      41.26528
 19.    2613660.      51.16209
 20.    3368421.      63.00000

MATRIX X'X ASSOCIATED WITH THE TEST PROBLEMS

       21.          210.           2870.           44100.            722666.            12333300.
      210.         2870.          44100.          722666.          12333300.           216455810.
     2870.        44100.         722666.        12333300.         216455810.          3877286700.
    44100.       722666.       12333300.       216455810.        3877286700.         70540730666.
   722666.     12333300.      216455810.      3877286700.       70540730666.       1299155279940.
 12333300.    216455810.     3877286700.     70540730666.     1299155279940.      24163571680850.


MATRIX X'Y FOR Y1

13103167.
229558956.
4106845446.
74647573242.
1373802809082.
25537373767266.


MATRIX X'Y FOR Y2

310.39960
5058.55410
87258.40800
1549291.38666
26043466.66600
514843723.46850
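
The arrays above can be rebuilt from the x and Y1 columns of Table 5. The following is a minimal sketch, assuming that X is the design matrix of a fifth-degree polynomial (columns 1, x, ..., x^5), which is what the 6 x 6 pattern of X'X suggests; the variable names are illustrative and are not taken from any of the programs tested. The tabulated Y1 values coincide with 1 + x + ... + x^5, and that expression is used here in place of retyping the column.

```python
# Rebuild X'X and X'Y (for Y1) with exact integer arithmetic.
xs = range(21)        # x = 0, 1, ..., 20, as tabulated in Table 5
degree = 5            # assumption: fifth-degree polynomial model

# Y1 column: the printed values coincide with 1 + x + ... + x**5.
y1 = [sum(x**k for k in range(degree + 1)) for x in xs]

# (X'X)[i][j] = sum over x of x**(i + j); Python evaluates 0**0 as 1.
xtx = [[sum(x**(i + j) for x in xs) for j in range(degree + 1)]
       for i in range(degree + 1)]

# (X'Y)[i] = sum over x of x**i * y
xty1 = [sum(x**i * y for x, y in zip(xs, y1)) for i in range(degree + 1)]

for row in xtx:
    print(row)
print(xty1)
```

Because Python integers are exact, the printed entries can be compared digit for digit with the matrices above; a floating-point version would lose the low-order digits of the largest entries in 8-digit or 9-digit single precision.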


APPENDIX  A 


SOURCES  OF  THE  PROGRAMS,  WITH  BRIEF  DESCRIPTIONS 

ALSQ. A FORTRAN IV subroutine to solve the linear least squares problem,
written by G. W. Stewart, III, Union Carbide Corp., Oak Ridge,
Tennessee (present address: University of Texas, Austin, Texas). This
program uses a modification of the Businger-Golub algorithm [8].

BJORCK-GOLUB. A FORTRAN V program to solve the linear least squares
problem, written by Roy H. Wampler, National Bureau of Standards, using
the Björck-Golub algorithm [6].

BMD02R, Stepwise Regression. One of the Biomedical Computer Programs,
written in FORTRAN [14].

BMD03R, Multiple Regression with Case Combinations. One of the
Biomedical Computer Programs, written in FORTRAN [14].

BMD05R, Polynomial Regression. One of the Biomedical Computer Programs,
written in FORTRAN [14].

DAM. A general purpose computer program for data processing and
multiple regression, written in FORTRAN by Rudolf R. Rhomberg, Lorette
Boissonneault, and Leonard Harris, International Monetary Fund [31].

LINFIT. A program which fits a linear function to collected data via
least squares. Optional constraints may be applied to the fitting
coefficients to make them non-negative, add to a constant, etc. One
of eighteen statistical routines written by James R. Miller [28].
This library of routines exists in the Project MAC 7094 in the disk
files of user number TI69 2750.




LINFIT***. A program written in BASIC for linear least squares curve
fitting and computing correlations. Origin: Dartmouth College, Hanover,
N. H. Available in the C-E-I-R Multi-Access Computer Services
library [10].

LSCF--***. A least squares polynomial curve fitting subroutine written
in BASIC. Origin: Dartmouth College, Hanover, N. H. Available in the
C-E-I-R Multi-Access Computer Services library [10].

LSFITW***. A least squares curve fitting program written in BASIC.
Adapted by John B. Shumaker, National Bureau of Standards, from Philip
J. Walsh's ORTHO algorithm [35]. Available in the C-E-I-R Multi-Access
Computer Services library [10].

LSTSQ. A FORTRAN IV subroutine which solves for X the overdetermined
system AX = B of m linear equations in n unknowns for p right-hand
sides. Written by Peter Businger, Computation Center, University of
Texas (present address: Bell Telephone Laboratories, Murray Hill,
N. J.), using the Businger-Golub algorithm [8].

MATH-PACK, ORTHLS. Orthogonal Polynomial Least-Squares Curve Fitting.
One of the Univac 1108 MATH-PACK programs, written in FORTRAN V [33].

MPR3. Stepwise Multiple Regression with Variable Transformations. A
FORTRAN II program written by M. A. Efroymson, Esso Research and
Engineering Co., Madison, N. J., using the Efroymson algorithm [15].
Available in the SHARE library: 7090-02 3H5KPR3 [23].
OMNITAB. A general purpose computer program for statistical and
numerical analysis. Developed at the National Bureau of Standards by
Joseph Hilsenrath et al. [21]. Now available in an A. S. A. FORTRAN
version, OMNITAB allows the user to communicate with a computer in an
efficient manner by means of simple English sentences.

ORTHO. A program written by Philip J. Walsh, National Bureau of
Standards (present address: University Computing Co., East Brunswick,
N. J.), which uses a Gram-Schmidt orthonormalization process for least
squares curve fitting. ORTHO exists as an ALGOL procedure [35], a
FORTRAN program, a BASIC program (see LSFITW*** above), and as a
routine of OMNITAB [21], where it is called by the commands FIT and
POLYFIT.

ORTHOL. A modification of the Davis-Rabinowitz orthonormalization
algorithm [12], [13], written in FORTRAN II by James W. Longley, Bureau
of Labor Statistics, Washington, D. C., and Roger A. Blau, Bureau of
Labor Statistics and Carnegie-Mellon University, Pittsburgh, Pa. [26].

POLFIT. An anonymous program written in BASIC for least squares
polynomial curve fitting using orthogonal polynomials.

POLRG, Polynomial Regression. One of the programs of the IBM
System/360 Scientific Subroutine Package written in FORTRAN IV [22].

STAT-PACK, GLH, General Linear Hypotheses. One of the Univac 1108
STAT-PACK programs, written in FORTRAN V [34].

STAT-PACK, REBSOM, Back Solution Multiple Regression. One of the
Univac 1108 STAT-PACK programs, written in FORTRAN V [34].

STAT-PACK, RESTEM, Stepwise Multiple Regression. One of the Univac
1108 STAT-PACK programs, written in FORTRAN V [34].

STAT20***. A program written in BASIC for stepwise multiple linear
regression. Written by Thomas E. Kurtz, Dartmouth College, Hanover,
N. H. Available in the C-E-I-R Multi-Access Computer Services
library [10].

STAT21***. A program written in BASIC for multiple linear regression
with detailed output. Written by Gerald Childs, Dartmouth College,
Hanover, N. H. Available in the C-E-I-R Multi-Access Computer
Services library [10].

WRAP, Weighted Regression Analysis Program. A FORTRAN II program
written by M. D. Fimple, Sandia Corp., Albuquerque, New Mexico.
Available in the SHARE library: 7090-02 3231WRAP [23].
ERROR ANALYSIS FOR CONTROL SYSTEMS


T.  H.  Slook 

Temple  University  and  Frankford  Arsenal 
Philadelphia,  Pennsylvania 


I. INTRODUCTION. From the days following World War II to the
present time, many research papers and books have been written on
feedback  control  systems.  In  almost  every  case,  these  publications 
emphasize  the  analysis  and  design  of  such  systems.  Relatively  few 
pages  have  been  devoted  to  error  analysis  techniques  for  control 
systems.  The  important  contributions  which  this  paper  makes  are: 

A.  To  exhibit  an  error  analysis  technique  for  an  arbitrary  control 
system;  and, 

B.  To  prove,  in  a  general  setting,  three  theorems  relating  the 
variances  and  power  spectral  densities  of  the  inputs  and  outputs  of 
such  systems. 

II.  MEASURES  OF  EFFECTIVENESS.  Every  measure  of  effectiveness 
for  a  control  system  involves,  either  directly  or  indirectly,  some 
knowledge  of  system  errors.  To  demonstrate  this  point  and  to  make  this 
paper  more  meaningful  and  less  abstract,  let  us  consider  a  fire  control 
system  (FCS).  Such  a  control  system  includes  tracking  servos,  data 
transmission  devices,  conversion  elements,  analog  and/or  digital 
computing components and weapon pointing servos, each of which
possesses errors and contributes to the overall system output errors.
Clearly, the magnitude and frequency of the output errors determine
the control system's effectiveness.

Two  of  the  many  measures  of  effectiveness  for  a  FCS  are  hit 
probability  and  kill  probability.  To  be  specific,  the  single  shot 
engagement  hit  probability  is  obtained  by  evaluating 


(A/(2π σ ...)) × ...                                          [II-1]

where

n = the number of rounds for an engagement,
A = target area,
$\sigma_b^2$ = variance of the bias, and
$\sigma_d^2$ = variance of the dispersion.
In this paper, the bias b is the deviation of the center of impact of
n rounds from the target center, and the dispersion d is the square
root of the average value of the square of the deviations of the rounds
from the center of impact.


Observe that P, defined above, is a function of $\sigma_b$ and $\sigma_d$. These
variances, whether used to calculate P or any other measure of
effectiveness, depend upon the variance in the error in the elevation
$\sigma_{\epsilon_E}^2$ and the variance in the error in train $\sigma_{\epsilon_T}^2$ of the gun
tube or launcher throughout the firing interval, and each of these
statistical measures depends upon:


a. errors in the inputs to the control system,

b. non-ideal system element errors,

c. system function approximations, and

d. vehicle-target paths.


Let us agree to call the above error sources the system input errors
for a FCS. Observe that (a), (b), and (c) are system input errors
for every control system, and that (d) is an additional error source
that must be considered in a FCS error analysis.


The  fact  that  every  control  system  consists  of  an  assemblage  of 
a  finite  number  of  components,  each  of  which  has  measurable  characteristics, 
generates,  in  a  natural  way,  a  finite  number  of  equations  relating  the 
inputs  and  outputs  of  the  control  system.  These  equations  are  called 
the  system  equations.  Some  of  the  system  equations  may  be  empirical. 

For  example,  the  ballistic  functions  are  empirical  equations  in  a  FCS. 


A  relatively  easy  and  straightforward  error  analysis  is  possible 
when  the  system  equations  are  not  differential  equations.  However, 
many control systems and most FCS generate an independent set of
differential equations. The inclusion of differential equations complicates
the  solution  of  the  system  of  error  equations.  This  we  now  explain. 


III. SYSTEM ERROR EQUATIONS. Consider a FCS of q system
elements having s independent inputs. This means that at each
instant of time, every system element will have at most
$X = \{x_1, x_2, \ldots, x_s\}$ inputs from outside the FCS and at most
$Y = \{y_1, y_2, \ldots, y_q\}$ inputs from within the FCS; see Figure 1.
Observe that the external inputs $x_j$ (j = 1, 2, ..., s) and the system
element outputs $y_k$ (k = 1, 2, ..., q) are functions of time, and it is
customary to assume that these inputs and outputs have continuous
first derivatives.
In Figure 1, the output at time t of the i-th system element having
$\{x_{i_1}, x_{i_2}, \ldots, x_{i_{r_i}}\}$, a subset of X, as external inputs and
$\{y_{i_1}, y_{i_2}, \ldots, y_{i_{p_i}}\}$, a subset of Y, as internal inputs is
described by

$$y_i = g_i(x_{i_1}, x_{i_2}, \ldots, x_{i_{r_i}}, y_{i_1}, y_{i_2}, \ldots, y_{i_{p_i}}).$$   [III-1]

Those x's and y's which are not inputs to the i-th system element are
not in the domain of $g_i$. The function $g_i$ is called the performance
operator of the i-th system element, and it determines the output $y_i$ of
this system element. Figure 1 shows that the output of the i-th system
element is also an input, and for a feedback loop we prefer to write
the performance equation in the implicit form,

$$f_i(x_{i_1}, x_{i_2}, \ldots, x_{i_{r_i}}, y_{i_1}, y_{i_2}, \ldots, y_i, \ldots, y_{i_{p_i}}) = 0.$$   [III-2]

The only change in the performance equation of the i-th system element
for a non-feedback loop would be the deletion of $y_i$ as an input variable
in equation [III-1].


FIGURE 1. The i-th system element, showing inputs from outside the FCS
and inputs which are outputs of system elements of the FCS.
In practice, each input to the i-th system element may possess
an error, and we denote the errors by $\epsilon_{x_{i_j}}$ (j = 1, 2, ..., $r_i$) and
$\epsilon_{y_{i_k}}$ (k = 1, 2, ..., $p_i$). Since the system element is not an ideal
element, the output of this imperfect element is the correct output
plus the system element error $m_i$. Each of these errors may also be
considered as functions of time; see Figure 2.


FIGURE 2. The i-th system element with errors on its external and
internal inputs; $m_i$ is the output error of the i-th system element due
to the imperfect i-th system element.


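Since performance operators are smooth functions, each $g_i$
(i = 1, 2, ..., q) possesses continuous first partial derivatives
with respect to the external and internal inputs. This implies that
every $f_i$ possesses this property. Hence, the change in $f_i$ produced
by increment changes $\epsilon_{x_{i_j}}$, $\epsilon_{y_{i_k}}$, and $m_i$ is

$$\Delta f_i = f_i(Q_i) - f_i(P_i) = \sum_{j=1}^{r_i} \left.\frac{\partial f_i}{\partial x_{i_j}}\right|_{P_i} \epsilon_{x_{i_j}} + \sum_{k=1}^{p_i} \left.\frac{\partial f_i}{\partial y_{i_k}}\right|_{P_i} \epsilon_{y_{i_k}} + \left.\frac{\partial f_i}{\partial y_i}\right|_{P_i} m_i$$   [III-3]

where

$$P_i = (x_{i_1}, \ldots, x_{i_{r_i}}, y_{i_1}, \ldots, y_i, \ldots, y_{i_{p_i}})$$

and

$$Q_i = (x_{i_1} + \epsilon_{x_{i_1}}, \ldots, x_{i_{r_i}} + \epsilon_{x_{i_{r_i}}}, y_{i_1} + \epsilon_{y_{i_1}}, \ldots, y_i + \epsilon_{y_i} + m_i, \ldots, y_{i_{p_i}} + \epsilon_{y_{i_{p_i}}}).$$

The points $P_i$ and $Q_i$ are in the domain of $f_i$, thus $\Delta f_i = 0$ and
equation [III-3] becomes

$$\sum_{k=1}^{p_i} \left.\frac{\partial f_i}{\partial y_{i_k}}\right|_{P_i} \epsilon_{y_{i_k}} = -\sum_{j=1}^{r_i} \left.\frac{\partial f_i}{\partial x_{i_j}}\right|_{P_i} \epsilon_{x_{i_j}} - \left.\frac{\partial f_i}{\partial y_i}\right|_{P_i} m_i.$$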
This is the error equation for the i-th system element describable by
other than a differential equation. Therefore, the set of error
equations for a FCS with s external inputs and q system elements
describable by other than differential equations is the linear system
of equations $A \epsilon_y = B$. This matrix equation we prefer to write:
The above technique may be employed to generate:


which is the set of error equations for a FCS with s external inputs
and q system elements describable by differential equations.
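
For the algebraic (non-differential) case, the linear error equations can be illustrated numerically. The sketch below uses a hypothetical two-element feedback system that does not appear in the paper: element 1 has output y1 = 2*x1 + 0.5*y2 and element 2 has output y2 = x2*y1. It assumes, as in the text, that each imperfect element delivers its correct output plus an element error m_i. The linearized errors obtained from a 2 x 2 system of the form A e_y = B are compared with the errors found by re-solving the perturbed system exactly.

```python
def solve_system(x1, x2, m1=0.0, m2=0.0):
    """Exact outputs of the hypothetical system
         y1 = 2*x1 + 0.5*y2 + m1,   y2 = x2*y1 + m2,
    obtained by eliminating y2 by hand."""
    y1 = (2.0 * x1 + m1 + 0.5 * m2) / (1.0 - 0.5 * x2)
    y2 = x2 * y1 + m2
    return y1, y2

# Nominal inputs, input errors and element errors (illustrative values).
x1, x2 = 1.0, 0.4
e_x1, e_x2 = 1e-3, -1e-3
m1, m2 = 5e-4, 0.0

y1, y2 = solve_system(x1, x2)            # error-free operating point

# First-order error equations A * e_y = B, evaluated at the nominal point:
#   e_y1 - 0.5*e_y2 = 2*e_x1 + m1
#  -x2*e_y1 + e_y2  = y1*e_x2 + m2
a11, a12 = 1.0, -0.5
a21, a22 = -x2, 1.0
b1 = 2.0 * e_x1 + m1
b2 = y1 * e_x2 + m2

# Solve the 2 x 2 system by Cramer's rule.
det = a11 * a22 - a12 * a21
e_y1 = (b1 * a22 - a12 * b2) / det
e_y2 = (a11 * b2 - a21 * b1) / det

# Exact output errors for comparison.
y1_p, y2_p = solve_system(x1 + e_x1, x2 + e_x2, m1, m2)
print(e_y1, y1_p - y1)
print(e_y2, y2_p - y2)
```

The two columns of printed values agree to within second-order terms in the input errors, which is the accuracy one expects from a linearization about a fixed operating point.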
Using [III-5] or [III-6] and a given set of system input errors,
one can determine the system output error vector
$(\epsilon_{y_1}, \epsilon_{y_2}, \ldots, \epsilon_{y_q})$,
providing the given set of system equations can be solved. Observe
that the coefficients in both systems of equations are functions of the
arbitrary but fixed points $P_i$. Thus, [III-5] is easily solved for $\epsilon_{y_i}$,
but [III-6] is not easily solved for $\epsilon_{y_i}$ when one or more infinite series
expansions occur. It is not the purpose of this paper to discuss conditions
for a solution to [III-6] because the external input errors for a control
system are given as variances $\sigma_{x_i}^2$ and not as $\epsilon_{x_i}$ (i = 1, 2, ..., q).
Hence, the main problem is to express the variance in the output errors
as functions of the variances in the system input errors. For a FCS
this means: express the variance in elevation error $\sigma_{\epsilon_E}^2$ and the variance
in train error $\sigma_{\epsilon_T}^2$ as functions of the variances in system input
errors. This we now discuss.


IV. STATISTICAL MEASURE OF OUTPUTS. To express $\sigma_{\epsilon_E}^2$ and $\sigma_{\epsilon_T}^2$ as
functions of the variances in the system input errors requires that we
prove several remarkable theorems. One may omit the proofs if he so
desires, because the theorems are proved only for the sake of completeness.


Let $L^1(\mu)$ be the Banach space of summable functions defined on
$X = \{t : -\infty < t < +\infty\}$ with $\mu$ as Lebesgue measure and $\|x\| = \int_X |x|\,d\mu$.
The following theorem exhibits a relationship between the derivative of
the variance in the variable x with respect to frequency and its power
spectral density.

Theorem 1. Let $x \in L^1(\mu)$. Then

ST  <">>  '  ST  V  <“>  -  3T  1^)  >2>  I1*'1’ 

where

a) $\sigma_x^2(\omega)$ is the variance of x,

b) $\Phi_{xx}(\omega)$ is the power spectral density of x,

c) $\bar{x}(\omega)$ is the mean of x.
we obtain


■4  <»x2  <“>> ■  4  <  x2(“>>  -  4  < (xM  >2> 


■  —  *xx<“>  -  T.  «*<“>  >  > 


x  (w)  “  /“  x(t)dt,  -OB  <•  w  <•  +  CO  , 

<■00 


In the above theorem, x may be thought of as an input to a
system element having response r. The output y for this system
element, in its most general form, is given by the convolution integral

$$y(t) = \int_{-\infty}^{\infty} x(z)\, r(t - z)\, dz, \qquad -\infty < t < +\infty.$$

Theorem  2  gives  the  relationship  between  the  power  spectral  density 
of  x  and  the  power  spectral  density  of  y. 

Theorem 2. If r and x belong to $L^1(\mu)$, then the power spectral
density of y is given by

$$\Phi_{yy}(\omega) = |R(\omega)|^2\, \Phi_{xx}(\omega)$$   [IV-3]

where $R(\omega)$ is the transfer function of the system element.*


Proof. Since the output y is defined by a convolution function
whose determining functions r and x belong to $L^1(\mu)$, then $y \in L^1(\mu)$
and the autocorrelation function of y


*It was brought to my attention by a member of the audience that
this property was known to Norbert Wiener. However, it should be
mentioned that in August of 1958, the team of Tappert, Pfellsticker,
and Slook, having no knowledge of Wiener's result, proved this property
in two entirely different ways.




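$$\phi_{yy}(\tau) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} y(t)\, y(t + \tau)\, dt$$

exists. Replacing y(t) and y(t + τ) by their respective convolution
integrals we obtain

$$\phi_{yy}(\tau) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \left\{ \int_{-\infty}^{\infty} x(u)\, r(t - u)\, du \right\} \left\{ \int_{-\infty}^{\infty} x(v)\, r(t + \tau - v)\, dv \right\} dt.$$

Let $-\nu = t - u$ and $-\rho = t + \tau - v$; then the above equation becomes

$$\phi_{yy}(\tau) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \left\{ \int_{-\infty}^{\infty} x(t + \nu)\, r(-\nu)\, d\nu \right\} \left\{ \int_{-\infty}^{\infty} x(t + \tau + \rho)\, r(-\rho)\, d\rho \right\} dt$$

$$= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} r(-\nu)\, r(-\rho) \left\{ \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} x(t + \nu)\, x(t + \tau + \rho)\, dt \right\} d\nu\, d\rho$$

$$= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} r(-\nu)\, r(-\rho) \left\{ \lim_{T\to\infty} \frac{1}{2T} \int_{-T+\nu}^{T+\nu} x(u)\, x(u + \tau + \rho - \nu)\, du \right\} d\nu\, d\rho$$

$$= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} r(-\nu)\, r(-\rho)\, \phi_{xx}(\tau + \rho - \nu)\, d\nu\, d\rho.$$   [IV-4]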

The  change  in  the  order  of  integration  is  possible  because  the  conditions 
of  the  Fubini  theorem  are  satisfied. 

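The Fourier integral of $\phi_{yy}(\tau)$ defines the power spectral density
of y, that is

$$\Phi_{yy}(\omega) = \int_{-\infty}^{\infty} \phi_{yy}(\tau)\, e^{-j\omega\tau}\, d\tau.$$

In this integral, replace $\phi_{yy}(\tau)$ by the last relation described in
equation [IV-4]. Thus $\Phi_{yy}(\omega)$ becomes

$$\Phi_{yy}(\omega) = \int_{-\infty}^{\infty} \left\{ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} r(-\nu)\, r(-\rho)\, \phi_{xx}(\tau + \rho - \nu)\, d\nu\, d\rho \right\} e^{-j\omega\tau}\, d\tau$$

$$= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} r(-\nu)\, r(-\rho) \left\{ \int_{-\infty}^{\infty} \phi_{xx}(\tau + \rho - \nu)\, e^{-j\omega(\tau + \rho - \nu)}\, d\tau \right\} e^{-j\omega\nu}\, e^{j\omega\rho}\, d\nu\, d\rho$$

$$= \left\{ \int_{-\infty}^{\infty} r(-\nu)\, e^{-j\omega\nu}\, d\nu \right\} \left\{ \int_{-\infty}^{\infty} r(-\rho)\, e^{j\omega\rho}\, d\rho \right\} \Phi_{xx}(\omega).$$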


The  change  in  order  of  integration  given  above  is  possible  as  the 
conditions  of  the  Fubini  theorem  hold. 

The Fourier transform of the response r,

$$R(\omega) = \int_{-\infty}^{\infty} r(v)\, e^{-j\omega v}\, dv,$$

is called the transfer function of the system element. Substituting
for the integral forms in equation [IV-4] their equivalent functions
$R(\omega)$ and $\overline{R(\omega)}$, one obtains the desired functional relationship,

$$\Phi_{yy}(\omega) = R(\omega)\, \overline{R(\omega)}\, \Phi_{xx}(\omega) = |R(\omega)|^2\, \Phi_{xx}(\omega).$$
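
A discrete, periodic analogue of Theorem 2 can be checked numerically. The sketch below is not the continuous-time proof; it assumes finite sequences, circular convolution, and the periodogram |X|^2 / n as a stand-in for the power spectral density. The sample values and function names are illustrative.

```python
import cmath
import math

def dft(seq):
    """Naive discrete Fourier transform with kernel exp(-j*2*pi*k*t/n)."""
    n = len(seq)
    return [sum(seq[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def circular_convolve(x, r):
    """y(t) = sum over z of x(z) * r(t - z), with indices taken modulo n."""
    n = len(x)
    return [sum(x[z] * r[(t - z) % n] for z in range(n)) for t in range(n)]

# Hypothetical input samples and element response (illustrative values only).
x = [0.3, -1.2, 0.8, 2.0, -0.5, 0.1, 1.4, -0.9]
r = [0.5, 0.3, 0.15, 0.05, 0.0, 0.0, 0.0, 0.0]
n = len(x)

y = circular_convolve(x, r)
X, R, Y = dft(x), dft(r), dft(y)

# Periodograms of the input and the output.
psd_x = [abs(v) ** 2 / n for v in X]
psd_y = [abs(v) ** 2 / n for v in Y]

# Largest deviation from psd_y = |R|^2 * psd_x over all frequencies.
max_gap = max(abs(py - abs(rk) ** 2 * px)
              for py, rk, px in zip(psd_y, R, psd_x))
print(max_gap)
```

The deviation is at rounding level because, for circular convolution, the transform of the output is exactly the product of the transforms of the input and the response.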

As  a  consequence  of  theorems  1  and  2,  we  obtain  a  useful  relationship 
between  the  variance  of  the  input  and  the  variance  of  the  output  of  a 
system  element.  This  result  we  embody  in  the  following  theorem. 


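Theorem 3. Let x be the input to a system element having response r and
output y. If r and x belong to $L^1(\mu)$ and $\bar{x}(\omega) = \bar{y}(\omega)$, then

$$\sigma_y^2 = \int_{-\infty}^{\infty} |R(\omega)|^2\, \frac{d\{\sigma_x^2(\omega)\}}{d\omega}\, d\omega.$$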


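Proof: Combining the relationships of theorems (1) and (2), we
obtain

$$\frac{d\{\sigma_y^2(\omega)\}}{d\omega} + \frac{d\{(\bar{y}(\omega))^2\}}{d\omega} = |R(\omega)|^2 \left[ \frac{d\{\sigma_x^2(\omega)\}}{d\omega} + \frac{d\{(\bar{x}(\omega))^2\}}{d\omega} \right]$$

which reduces to

$$\frac{d\{\sigma_y^2(\omega)\}}{d\omega} = |R(\omega)|^2\, \frac{d\{\sigma_x^2(\omega)\}}{d\omega}.$$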
Hence

$$\sigma_y^2 = \int_{-\infty}^{\infty} \frac{d\{\sigma_y^2(\omega)\}}{d\omega}\, d\omega = \int_{-\infty}^{\infty} |R(\omega)|^2\, \frac{d\{\sigma_x^2(\omega)\}}{d\omega}\, d\omega.$$

Theorem 3 establishes, for a system element consisting of a single input
and a single output, the functional relationship between the variances
of these variables. Applying this technique to the system element
illustrated in Figure 2, which has $r_i + p_i + 1$ sources of error $\epsilon_{x_{i_j}}$
(j = 1, 2, ..., $r_i$), $\epsilon_{y_{i_k}}$ (k = 1, 2, ..., $p_i$) and $m_i$, we obtain


1  1 

d{tJ2  (u)  +o  2<u>)}  2  ST  d{0  2(u)>  V"  d 

*  \  1  +  L  *  6 

j-l  k*l 


<u)} 


which  reduces  to 


d  {o  (u>)} 

Zi  yi  * 


,  (w) 


d{o_  *<«>} 


2  JL  d{0„  2(w) }  d  (c_ 


dm  yik  "  lRi^w)l  ^  dZT  ^  ~d^  * 


k-1 

(■ 


j-l 


Therefore, the set of variance error equations for a FCS with s external
inputs and q system elements may be written:

P1  d  {e„  2<w)}  ,  rl  d{o  2  (w)>  d{o  2(a>)} 

—  *  r"1 _ e„  ®i 

du> 

J-l  J 

C IV— 6 1 

2/  ...  2 

e„ 


■i<“>  ■  I  dl  \  "  *Ri<w)  1  E  dS  xi  "  ~  “1 

t _ 1  ” 


k-1 


Sq(o.)  . 


E 

k-1 


d  {o  Z(oi)>  ,  rq  d  {o  Z(u)>  d{a„  <w)  > 

E“  -  |R  (u)  | 2  ~  e“  — 


dT  y 


I 

j-l 


j  x 

au  q 


J 


du  ^ 


Ca>)> 
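Solve this system of linear equations in

$$\frac{d\{\sigma_{\epsilon_{y_{i_k}}}^2(\omega)\}}{d\omega} \qquad (i = 1, 2, \ldots, q)$$

for the outputs desired. In the case of a FCS one would solve for

$$\frac{d\{\sigma_{\epsilon_E}^2(\omega)\}}{d\omega} \quad \text{and} \quad \frac{d\{\sigma_{\epsilon_T}^2(\omega)\}}{d\omega}.$$

Thus, the variance in elevation and the variance in train may be
calculated by using equations

$$\sigma_{\epsilon_E}^2 = \int_{-\infty}^{\infty} \frac{d\{\sigma_{\epsilon_E}^2(\omega)\}}{d\omega}\, d\omega \quad \text{and} \quad \sigma_{\epsilon_T}^2 = \int_{-\infty}^{\infty} \frac{d\{\sigma_{\epsilon_T}^2(\omega)\}}{d\omega}\, d\omega.$$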


Observe that the above technique provides a means for determining
variances of the output errors of a control system describable by
differential equations. These variances, as demonstrated in the fire
control example, may be used to determine a measure of effectiveness
for the control system.
ABSTRACT. This is an expository paper on the analysis of
contingency  tables  given  at  the  Fourteenth  Conference  on  the  Design  of 
Experiments.  The  principle  of  minimum  discrimination  information  estima¬ 
tion  is  described  and  used  to  generate  estimates  for  tests  of  hypotheses 
concerning  second-order  and  higher-order  interactions.  All  classical 
hypotheses  for  contingency  tables  can  be  generated  by  the  use  of  this 
principle  when  certain  marginals  are  considered  as  fixed. 

Examples  are  given  and  two  available  computation  programs  are 
described  in  detail. 

I. INTRODUCTION. In the January issue of the Journal of the Royal
Statistical Society, there is a paper by M. G. Kendall (1968) entitled,
"On the Future of Statistics - A Second Look." A particular paragraph
in his paper concerns the topic under discussion today. We quote:

19. It is rather a hazardous task to try to forecast
the future lines of development of theoretical statistics,
but there seem to me to be two major growing points and I
should like to consider them in some detail. The first
concerns the bridging of the gap between theory and practical
requirements in multivariate analysis. The problems
which are encountered in nearly all statistical enquiries
concerned with this subject are very far from being solved.
I will cite a few examples from what might be a very long
list:

(1) Multiple contingency tables. The problems of manifold
classification in p dimensions are of three kinds:
the pure problem of display so that one can look at
the results as a whole; the problem of empty cells,
or small frequencies, which are apt to arise on the
edges of a table even for large samples; and, perhaps
the most difficult of all, a method of analysis which
will bring out the various inter-relationships among
the classificatory variables.

*Supported in part by the Air Force Office of Scientific Research,

Office  of  Aerospace  Research,  United  States  Air  Force,  under  grant 
We agree with Kendall on both counts: that the problem needs further
investigation and that the problem is a difficult one. The procedure we
present today proposes a unified treatment of multi-dimensional contingency
tables, and we believe it to be a step in the right direction.

A. Formulation of the Hypothesis. The formulation of a meaningful
hypothesis of no interaction in a multi-way table is not as simple as one
might expect at first. For the simplest case beyond a two-way table, a
2 x 2 x 2 table, with modified conventional notations as shown in Figure 1.1,
Bartlett (1935) defined "no second-order interaction" as implying

(1.1)   $$\frac{p(111)\,p(221)}{p(121)\,p(211)} = \frac{p(112)\,p(222)}{p(122)\,p(212)}$$


Figure 1.1. Bartlett's definition of no second-order interaction for a
2 x 2 x 2 table.

This definition is essentially an extension of the cross-product ratio
definition of independence in a 2 x 2 table. The hypothesis proposed
is the equality of association between classifications R and C within
$D_1$ and $D_2$. How would one go about extending this formulation to higher
dimension tables with more than two categories within each classification?
How many relations of the form (1.1) does one need to express the hypothesis
of no second-order interaction in such cases? These questions were studied
by Roy and Kastenbaum (1955).
B. Computation of Expected Frequencies. Once a null hypothesis
is decided upon, the next step is to estimate the expected cell frequencies
under the null hypothesis using the marginal frequencies, in the same way
as we estimate cell frequencies under the independence hypothesis in an
r x c table, using

$$p(ij) = p(i.)\,p(.j) = \frac{x(i.)}{n} \cdot \frac{x(.j)}{n},$$

where $x(i.) = \sum_j x(ij)$, $x(.j) = \sum_i x(ij)$, $\sum_i \sum_j x(ij) = n$, and x(ij) is the
observed frequency in the ij-th cell. For expression (1.1), Bartlett
proposed to solve for A in the expression

(1.2)   $$\frac{[x(111) + A]\,[x(221) + A]}{[x(121) - A]\,[x(211) - A]} = \frac{[x(112) - A]\,[x(222) - A]}{[x(122) + A]\,[x(212) + A]}$$

which is a third degree equation in A. Note that this implies that the
two-way marginals are unchanged. Then a statistic $X^2 = A^2 \sum_i \sum_j \sum_k [1/x(ijk)]$,
asymptotically distributed as $\chi^2$ under the null hypothesis, can be computed
for a test with one degree of freedom. For a three-way r x c x d table,
one has to solve (r-1)(c-1)(d-1) third-degree simultaneous equations
in as many unknowns. The computation involved is not a trivial one!
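
For the 2 x 2 x 2 case, equation (1.2) can be solved numerically without expanding the cubic. The sketch below uses hypothetical cell counts (not data from this paper) and plain bisection; it relies on the observation that, after cross-multiplying (1.2), the difference of the two products increases steadily with A as long as every adjusted cell stays positive. The names `PLUS`, `MINUS` and `g` are illustrative.

```python
# Hypothetical 2 x 2 x 2 counts x(ijk), keyed by (i, j, k); illustrative only.
x = {(1, 1, 1): 156, (1, 1, 2): 84, (1, 2, 1): 84, (1, 2, 2): 156,
     (2, 1, 1): 107, (2, 1, 2): 133, (2, 2, 1): 31, (2, 2, 2): 209}

PLUS = [(1, 1, 1), (2, 2, 1), (1, 2, 2), (2, 1, 2)]    # cells adjusted by +A
MINUS = [(1, 2, 1), (2, 1, 1), (1, 1, 2), (2, 2, 2)]   # cells adjusted by -A

def g(a):
    """Cross-multiplied form of (1.2); the A**4 terms cancel, leaving a cubic."""
    plus, minus = 1.0, 1.0
    for cell in PLUS:
        plus *= x[cell] + a
    for cell in MINUS:
        minus *= x[cell] - a
    return plus - minus

# Bracket the root on the interval where all adjusted cells are positive.
lo = -min(x[cell] for cell in PLUS)     # g(lo) < 0
hi = min(x[cell] for cell in MINUS)     # g(hi) > 0
for _ in range(200):
    mid = (lo + hi) / 2.0
    if g(mid) < 0:
        lo = mid
    else:
        hi = mid
a = (lo + hi) / 2.0

# Adjusted cell values and the two cross-product ratios of (1.1).
fitted = {cell: x[cell] + a for cell in PLUS}
fitted.update({cell: x[cell] - a for cell in MINUS})
left = fitted[1, 1, 1] * fitted[2, 2, 1] / (fitted[1, 2, 1] * fitted[2, 1, 1])
right = fitted[1, 1, 2] * fitted[2, 2, 2] / (fitted[1, 2, 2] * fitted[2, 1, 2])
print(a, left, right)
```

Each two-way marginal contains one cell adjusted by +A and one adjusted by -A, so the adjusted table keeps the observed two-way marginals, as noted in the text.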

C.  Interpretation  of  Results.  Once  we  have  formulated  the 
hypothesis  and  performed  the  computations,  we  need  to  interpret  the 
results  in  terms  of  the  actual  physical  variables.  What  does  no  second- 
order  interaction  in  a  four-way  table  mean?  How  about  no  third-order 
interaction? In some cases the interpretation may be quite natural; in
other cases the interpretation would be rather stretched. A general
interpretation  that  may  apply  to  a  majority,  if  not  all,  of  the  cases 
would  be  extremely  desirable. 

II. SUMMARY OF A PROPOSED PROCEDURE FOR THE ANALYSIS OF
MULTIDIMENSIONAL CONTINGENCY TABLES. We now discuss a procedure for the
analysis of multi-dimensional contingency tables which we believe has
general applicability. We shall sketch the principle and structure of
the proposed analysis and then illustrate the procedure with a four-way
table. For details see Ku and Kullback (1968) and Ireland and Kullback
(1968). The one by Ireland and Kullback contains the proofs of the main
results and applies the procedure to a problem of data adjustment. The
one by Ku and Kullback applies the procedure to the testing of hypotheses,
in particular the formulation, estimation and testing of second-order and
higher-order interactions. We shall discuss the procedure for a three-way
table, using a modified form of conventional notation.
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•  -  -  “*v  vww  Lduic  ui  luitiem  |  we  may  visuaxize 

three  associated  tables  as  follows: 

(1) The π-table {π(ijk)}. The π-table may be specified by the
null hypothesis, given by observations, or estimated. For example, the
π-table may specify equal probability in all the cells, three-way independence,
etc.


(2) The class of p-tables denoted by {p(ijk)}. A p-table is a
contingency table that satisfies certain conditions of interest, usually
a specification of the marginals, for instance, the one-way marginals
p(i..), p(.j.) and p(..k).

(3) The p*-table {p*(ijk)}. The p*-table is that member of the
class of p-tables which most closely "resembles" the π-table in the sense
of minimum discrimination information; i.e., the p*-table minimizes the
discrimination information

(2.1)  $I(p:\pi) = \sum_{ijk} p(ijk) \ln \dfrac{p(ijk)}{\pi(ijk)}$

over the class of p-tables.
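As a small illustration of (2.1), the sketch below evaluates the discrimination information for a table stored as a flat list of cell probabilities. The function name and the eight cell values are hypothetical and are not taken from the references; cells with p = 0 are taken to contribute zero.

```python
import math

def discrimination_information(p, pi):
    """I(p:pi) = sum of p ln(p/pi) over cells; cells with p = 0 contribute 0."""
    total = 0.0
    for p_cell, pi_cell in zip(p, pi):
        if p_cell > 0.0:
            total += p_cell * math.log(p_cell / pi_cell)
    return total

# Hypothetical 2 x 2 x 2 table, flattened to 8 cells, compared with the uniform table.
p = [0.20, 0.10, 0.05, 0.15, 0.10, 0.10, 0.05, 0.25]
uniform = [1.0 / len(p)] * len(p)

print(discrimination_information(p, uniform))        # positive: p differs from uniform
print(discrimination_information(uniform, uniform))  # zero: a table resembles itself exactly
```

The value is zero only when the two tables coincide, which is why it can serve as the measure of "resemblance" used in the definition of the p*-table.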

With these three tables fixed in mind, we shall summarize the
main results given in the two references.

A. If we set π(ijk) = 1/rcd, the uniform r x c x d table, then the
classical hypotheses of independence, homogeneity, conditional independence,
no second-order interaction, etc. are represented by p*-tables when certain
marginals are considered as fixed, and can be considered as "generalized"
independence hypotheses. Thus, when p(i..), p(.j.), p(..k) are fixed, the
p*-table has the form (for any π-table)

(2.2)  $p^*(ijk) = a(i)\,b(j)\,c(k)\,\pi(ijk)$

where a(i), b(j), c(k) are determined to satisfy the marginal restraints.

It turns out that for π(ijk) = 1/rcd,

(2.3)  $p_1^*$-table:  $p_1^*(ijk) = p(i..)\,p(.j.)\,p(..k)$.

When two of the two-way marginals, say p(ij.) and p(i.k), are specified,
then the p*-table takes the form

(2.4)  $p^*(ijk) = a(ij)\,b(ik)\,\pi(ijk) = \dfrac{p(ij.)\,p(i.k)}{p(i..)}$  for $\pi = \dfrac{1}{rcd}$.

When all three two-way marginals are considered fixed, the p*-table
has the form

(2.5)  $p_2^*$-table:  $p_2^*(ijk) = a(ij)\,b(jk)\,c(ik)\,\pi(ijk)$

and the p*_2-table satisfies Bartlett's definition of no second-order
interaction for π = 1/rcd, since

$$\frac{p_2^*(111)\,p_2^*(221)}{p_2^*(121)\,p_2^*(211)}
 = \frac{a(11)b(11)c(11)\pi \; a(22)b(21)c(21)\pi}{a(12)b(21)c(11)\pi \; a(21)b(11)c(21)\pi}
 = \frac{a(11)\,a(22)}{a(12)\,a(21)}
 = \frac{p_2^*(112)\,p_2^*(222)}{p_2^*(122)\,p_2^*(212)}.$$


A straightforward convergent iterative procedure is given later to
determine p*(ijk).


A pictorial representation may be visualized as shown in Figure
2.1. Let the ordinate represent some measure of association or dependence.
Then the uniform table π would be at the zero datum. Now let the p-tables
be represented by the series of regions above it. If there is no restriction
on p, p will include π and p* is π. With one-way marginal restraints,
the class p becomes smaller and shrinks away from π. Then the p*_1 table
is the one closest to π that yet satisfies these one-way marginal restraints.

With all two-way marginals fixed (and hence also the one-way marginals),
the region shrinks further and p*_2 is the table closest to π, and is also
the table closest to p*_1. The observed sample table is represented by a
point p in the picture. The closeness of the resemblance from one table
to another table is measured by the discrimination information, and the
following relationships hold.


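(2.6)
$$\begin{aligned}
I(p:\pi)     &= I(p_1^*:\pi)   + I(p:p_1^*) \\
I(p:\pi)     &= I(p_2^*:\pi)   + I(p:p_2^*) \\
I(p_2^*:\pi) &= I(p_1^*:\pi)   + I(p_2^*:p_1^*) \\
I(p:p_1^*)   &= I(p_2^*:p_1^*) + I(p:p_2^*)
\end{aligned}$$

In each relationship the first term on the right is the effect of marginals
and the second term is the measure of interaction.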
In general, if p*_a corresponds to a set H_a of given marginals and p*_b
corresponds to a set H_b of given marginals where H_a is implied by H_b
(H_a ⊂ H_b), then

$$I(p:p_a^*) = I(p_b^*:p_a^*) + I(p:p_b^*).$$
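A brief sketch of why this additive relationship holds, assuming that both p*_a and p*_b have the product form of (2.2) to (2.5) with the same π, and that every marginal fixed under H_a is also fixed (directly or by implication) under H_b:

```latex
\begin{aligned}
I(p:p_a^*) - I(p:p_b^*)
  &= \sum_{ijk} p(ijk)\,\ln\frac{p_b^*(ijk)}{p_a^*(ijk)} \\
  &= \sum_{ijk} p_b^*(ijk)\,\ln\frac{p_b^*(ijk)}{p_a^*(ijk)}
   \;=\; I(p_b^*:p_a^*).
\end{aligned}
```

The second equality uses the product form: the logarithm of the ratio is a sum of terms, each depending only on the indices of a marginal fixed under H_b, and p and p*_b have the same marginals of that kind, so the two weighted sums agree term by term.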

B. The values of the p*-table can be computed by an iterative scheme
which adjusts the π(ijk) to satisfy successively the given marginal
restraints. For a three-way table when all two-way marginals are given,
we cycle through

(2.7)
$$\begin{aligned}
p^{(0)}(ijk)    &= \pi(ijk), \\
p^{(3n+1)}(ijk) &= \frac{p(ij.)}{p^{(3n)}(ij.)}\; p^{(3n)}(ijk), \\
p^{(3n+2)}(ijk) &= \frac{p(i.k)}{p^{(3n+1)}(i.k)}\; p^{(3n+1)}(ijk), \\
p^{(3n+3)}(ijk) &= \frac{p(.jk)}{p^{(3n+2)}(.jk)}\; p^{(3n+2)}(ijk), \qquad n = 0, 1, 2, \ldots
\end{aligned}$$

It can be shown that the iteration converges to p* and p* is unique.
For (2.4) the iteration is completed at the end of the first cycle.

C. The p*-table provides RBAN (Regular Best Asymptotically Normal)
estimates under the given constraints, and

$$2nI(p:p^*) = 2I(x:x^*) = 2\sum_{ijk} x(ijk)\,\ln\frac{x(ijk)}{x^*(ijk)}$$

is asymptotically distributed as χ² under the corresponding null
hypothesis, including the no second-order interaction hypothesis.
Here nπ = n/rcd, x = np, and x* = np*. Usually 3 to 7 cycles are sufficient
to obtain agreement between marginals to within .01 or .001, when more
than one cycle is required.
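The cycle (2.7) and the statistic 2nI(p:p*) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the KKV68A or KKV68B programs: the function names, the dictionary representation of a table, the stopping rule and the 2 x 2 x 2 counts are all hypothetical.

```python
import math
from itertools import product

def margin(table, axes):
    """Collapse a table {(i, j, k): value} onto the index positions in `axes`."""
    out = {}
    for cell, value in table.items():
        key = tuple(cell[a] for a in axes)
        out[key] = out.get(key, 0.0) + value
    return out

def fit_p_star(p, start, fixed, tol=1e-10, max_cycles=500):
    """Cycle as in (2.7): rescale the current table to each fixed marginal of p in turn.

    Returns the fitted table and the number of complete cycles used.
    """
    targets = [margin(p, axes) for axes in fixed]
    current = dict(start)
    for cycle in range(1, max_cycles + 1):
        for axes, target in zip(fixed, targets):
            fitted = margin(current, axes)
            for cell in current:
                key = tuple(cell[a] for a in axes)
                current[cell] *= target[key] / fitted[key]
        # Largest disagreement between the fixed marginals and the fitted ones.
        worst = max(
            abs(target[key] - value)
            for axes, target in zip(fixed, targets)
            for key, value in margin(current, axes).items()
        )
        if worst < tol:
            break
    return current, cycle

def information(p, q):
    """Discrimination information I(p:q) = sum of p ln(p/q) over cells."""
    return sum(v * math.log(v / q[cell]) for cell, v in p.items() if v > 0.0)

# Hypothetical 2 x 2 x 2 counts x(ijk); these are not data from the paper.
cells = list(product(range(2), repeat=3))
counts = [20, 15, 10, 25, 12, 18, 30, 20]
n = float(sum(counts))
p = {cell: count / n for cell, count in zip(cells, counts)}
uniform = {cell: 1.0 / len(cells) for cell in cells}

two_way = [(0, 1), (0, 2), (1, 2)]           # p(ij.), p(i.k), p(.jk)
p2, cycles = fit_p_star(p, uniform, two_way)
statistic = 2.0 * n * information(p, p2)      # 2nI(p:p*), 1 d.f. for a 2 x 2 x 2 table

# Case (2.4): with only p(ij.) and p(i.k) fixed, one cycle suffices.
p_cond, cycles_cond = fit_p_star(p, uniform, [(0, 1), (0, 2)])
print(cycles, cycles_cond, round(statistic, 3))
```

Because every intermediate table keeps the product form a(ij)b(ik)c(jk)π, the two cross-product ratios in Bartlett's condition agree at every step; only the agreement with the marginals requires iteration.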

Now  let  us  consider  these  results  with  respect  to  the  three 
problems  raised  at  the  beginning  of  this  paper;  i.e.,  the  problems  in 
the  formulation  of  the  hypothesis,  the  computation  of  expected  cell 
frequencies,  and  the  interpretation  of  results. 

First, we have defined a measure of "closeness" between two discrete
distributions by the discrimination information given in (2.1). A
hypothesis of interest is usually concerned with independence or association
between various classifications. By necessity, the expected
cell frequencies under such hypotheses will have to be estimated from
observed marginal frequencies. Hence, all these hypotheses are members
of the "generalized" independence, or no interaction hypothesis, represented
by the table which is closest to the uniform π-table, subject
to various marginal restraints. These tables are the p*-tables in our
procedure.

Second, we have an iterative scheme for the computation of p* or
np*. There are two computer programs available which we shall discuss
later.


We shall dwell on the third problem, the interpretation, at some
length, since this is the aspect in which we are most interested. We
shall give first a general interpretation and then details.

We may consider the complete sample table to contain all the
"information"  available  from  the  particular  experiment.  In  the  process 
of  analysis,  we  aim  to  express  the  sample  table  in  a  reduced  number  of 
parameters  represented  by  some  or  all  of  the  marginal  totals.  In  other 
words,  we  are  interested  in  knowing  how  much  of  this  total  information 
is  contained  in  a  summary  consisting  of  sets  of  marginal  tables. 

If  there  is  no  first-order  interaction,  i.e.,  there  is  independence 
of  all  classifications,  then  all  the  information  is  contained  in  the 
first-order  marginals  in  the  sense  that  given  these  marginals,  the 
complete  table  can  be  constructed  to  within  sampling  error.  If  the 
first-order interaction is significant, but there is no second-order
interaction, then the set of two-way marginals will be required to
summarize the data adequately. The use of two-way tables to summarize
multi-way classification data is a rather common practice, and the
implied assumption is therefore "no second- and higher-order interactions."
A direct consequence of this interpretation is that the analysis
can be reduced to that of the set of marginal tables if there is no
interaction of the same order.

We  remark  that  the  set  of  marginal  tables  must  be  considered 
jointly  for  proper  interpretation,  and  if  one  or  more  of  these  tables 
show  significant  interactions,  the  results  of  tests  of  the  remaining 
tables  could  lead  to  erroneous  conclusions.  An  example  of  such  a  case 
was given in Simpson (1951).

The  above  interpretation  is  not  restricted  to  complete  sets  of 
marginals.  If  the  p*-table  computed  from  three  out  of  the  six  two-way 
marginals  in  a  four-way  table  is  found  to  be  "close  enough"  to  the 
p-table by our test, the three two-way marginal tables could be considered
as containing essentially all the information in the four-way
table.  The  analysis  can  therefore  be  performed  on  these  marginal 
tables  and  the  complexity  of  the  problem  reduced.  For  example,  the 
analysis  for  a  four-way  table  may  be  reduced  to  that  of  one  two-way 
and  two  three-way  tables,  or  to  that  of  three  two-way  tables  and  one 
three-way  table,  provided  that  the  corresponding  interactions  are 
found  to  be  not  significant. 

On  the  other  hand,  we  may  also  wish  to  estimate  the  effects,  or 
contributions, of the specified marginal tables. An analysis of information
table can be constructed using the relationships given in (2.6),
wherein  all  the  components  of  information  are  additive  as  well  as  the 
associated  degrees  of  freedom.  The  interpretation  of  such  a  table  is 
very  similar  to  that  of  an  analysis  of  variance  table. 

We remember that

(2.9)  $p_1^*(ijk) = a(i)\,b(j)\,c(k)\,\pi(ijk)$, or

       $\ln p_1^* = \ln a(i) + \ln b(j) + \ln c(k) + \ln \pi(ijk)$,

which compares with

(2.10)  $E(y) = \mu + r_i + c_j + d_k$,

the  usual  model  for  analysis  of  variance.  Thus  each  model  can  be 
expressed  as  the  sum  of  a  grand  mean,  a  row  effect,  a  column  effect, 
and  a  depth  effect.  Instead  of  a  linear  additive  model,  we  have  a 
logarithmic  linear  additive  model.  This  fact  is  interesting  in  the 
sense  that  we  did  not  assume  such  a  model  to  start  with  as  others  have, 
but  ended  up  with  this  model  by  minimizing  the  discrimination  information. 
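A short sketch of how the product form arises from the minimization, for the case of fixed one-way marginals. The multipliers α_i, β_j, γ_k are notation introduced here for illustration only; the proofs are in the references.

```latex
\begin{aligned}
L &= \sum_{ijk} p(ijk)\,\ln\frac{p(ijk)}{\pi(ijk)}
   - \sum_i \alpha_i \Big(\sum_{jk} p(ijk) - p(i..)\Big)
   - \sum_j \beta_j \Big(\sum_{ik} p(ijk) - p(.j.)\Big)
   - \sum_k \gamma_k \Big(\sum_{ij} p(ijk) - p(..k)\Big), \\[4pt]
\frac{\partial L}{\partial p(ijk)} &= \ln\frac{p(ijk)}{\pi(ijk)} + 1 - \alpha_i - \beta_j - \gamma_k = 0
\quad\Longrightarrow\quad
p^*(ijk) = e^{\alpha_i - 1}\, e^{\beta_j}\, e^{\gamma_k}\, \pi(ijk).
\end{aligned}
```

Identifying the three exponential factors with a(i), b(j) and c(k) gives (2.9), so the logarithmic linear form is a consequence of the minimization rather than an assumption.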
The procedure is also analogous to multivariate regression
analysis. We are in fact fitting the observed frequencies using the
marginals as variables. The "a's," "b's," and "c's" are the fitted
coefficients. If the effects of the two-way marginals are small,
then p*_2 ≈ p*_1, and the values of a(ij), b(jk), c(ik) are close to unity.

The  additional  effect  of  the  two-way  marginals  given  the  one-way 
marginals  is  represented  by 


(2.11)
$$2nI(p_2^*:p_1^*) = 2\sum_{ijk} x_2^*(ijk)\,\ln\frac{x_2^*(ijk)}{x_1^*(ijk)}
 = 2nI(p:p_1^*) - 2nI(p:p_2^*),$$


or  the  difference  between  the  information  statistics  measuring  the 
first-order  interaction  and  the  second-order  interaction.  Since 


$$\ln\frac{p_2^*}{p_1^*} = \ln a(ij) + \ln b(ik) + \ln c(jk),$$

we could write also

(2.12)
$$2nI(p_2^*:p_1^*) = 2\sum_{ij} x(ij.)\,\ln a(ij) + 2\sum_{ik} x(i.k)\,\ln b(ik) + 2\sum_{jk} x(.jk)\,\ln c(jk),$$

where a(ij), b(ik) and c(jk) can be computed as products of ratios of
marginals in the iteration process using p*_1 as the starting table,
p^(0) = p*_1.

While (2.12) is algebraically correct, and the value of
2nI(p*_2:p*_1) is unique, the three components appearing on the right

side  of  the  equation  are  not  necessarily  independent,  and  the  computed 
values  of  these  terms  depend  on  the  order  of  the  two-way  marginals 
within  the  cycle  of  iteration.  The  properties  of  these  components 
need  further  investigation. 

Hence, if a breakdown of the two-way marginal effect is desired,
a conditional approach is necessary; i.e., the two-way marginal restraints
are considered in successive sets, where each set implies the
preceding one, and the effect of a particular two-way marginal is
computed conditioned on the preceding set of two-way marginals that
had been fixed. An example is given for a 2 x 2 x 2 x 2 table in
Appendix A, together with computer print-out and details of computation.
This procedure is discussed in further detail in Ku and Kullback (1968),
and the results of computing these effects by different approaches are
compared for two four-way tables.

The  main  advantage  in  using  an  analysis  of  information  table  such 
as  that  given  in  Appendix  A  is  that  the  table  presents  an  additive 
analysis  of  the  complete  contingency  table,  rather  than  just  a  special 
segment of the analysis, say the hypothesis of no second-order interaction.
Therefore it aids in seeing the picture as a whole and in
understanding  its  underlying  structure. 

We  list  also,  in  Table  2.1,  a  number  of  results  from  examples 
appearing  in  current  literature. 

III. THE COMPUTER PROGRAMS. There are two computer programs
available  for  the  analysis  of  contingency  tables  by  the  procedure 
described  above.  These  programs,  designated  KKV68A  and  KKV68B, 
respectively,  are  written  in  double  precision  FORTRAN  V  language  and 
are  now  on  FASTRAND  at  the  National  Bureau  of  Standards  for  use  with 
its  1108  computer.* 

The  two  programs  are  basically  similar  and  can  be  used  for  the 
analysis  of  up  to  four-way  tables.  Whereas  program  A  allows  more 
categories  for  each  classification  than  B,  program  B  allows  some 
options that are not available in A. We shall describe these programs
in  detail  and  note  the  computation  and  options  that  are  available. 


*Because of differences in compilers and peculiarities of behavior
of different models of computers, certain minor changes may have to be
made before these programs will work on other computers. We would be
happy to furnish, to persons interested in using these programs, card
images on a blank tape to be sent to the first two authors at:

Statistical  Engineering  Laboratory 
National  Bureau  of  Standards 
Washington,  D.  C.  20234 
Data from Berkson [1968]

Three-Way Tables

Comparison of 2I(x:x*), χ² and Minimum Logit χ² Values

                                        2I             χ²             χ²
Example from                        Min. Disc.    MLE and Min.    Min. Logit    D.F.
                                       Inf.        Disc. Inf.
Cochran, 2x2x3                         .854           .851           .849         2
Woolf, 2x2x3                          2.9655         2.9839         2.9811        2
Norton, 2x12x2                        7.71           7.59           7.37         11
Bartlett, 2x2x2                       2.2945         2.2704         2.2641        1
Kastenbaum and Lamphiear, 2x3x5       3.160          3.158          3.128         8

Data from Ku and Kullback [1968], Bhapkar and Koch [1968]

Four-Way Tables

                                           Second-Order               Third-Order
Example from                               Interaction      D.F.      Interaction
Hoyt, Krishnaiah, and Torrance, 7x4x3x2      172.257         108         44.793
Ries and Smith, 2x2x2x3                        9.847           9           .739
Kihlberg, Narragon and Campbell, 2x2x2x2       7.33            5           .67

Table 2.1


(1) Dimension limitations.

FOR           R          C          D          T
KKV68A      r ≤ 9     c ≤ 19     d ≤ 9      t ≤ 4
KKV68B      r ≤ 7     c ≤ 9      d ≤ 4      t ≤ 3

(1A) Requirements for computer memory locations.

FOR          Code       Data      Total
KKV68A       7323      41495      48818
KKV68B       9387      11670      21257

(2)  Input  data  and  options. 

A.  Title  cards  are  provided  for  each  of  the  classifications. 

B. Tables of sample data X(IJKL) = Np(ijkl). These data are
punched on cards (in 7D10.0 format) and read in by columns within each
row, row x column within each depth, and row x column x depth within each
level.

C. In program A, Nπ(IJKL) = F(IJKL) is always taken to be equal to
n/rcdt and iterations begin with these numbers for each of the three
iterations to compute Np*_1(ijkl), Np*_2(ijkl) and Np*_3(ijkl).

In program B, there are two options: (i) The first option uses
F(IJKL) = n/rcdt, the same as in program A. The computation sequence in
program B, however, differs from that of A in that the iteration begins
with F(IJKL) to compute the cell frequencies for the no first-order
interaction NP*_1, then uses NP*_1 as input to compute the cell frequencies
for no second-order interaction NP*_2, and uses NP*_2 to compute cell
frequencies under the no third-order interaction, NP*_3. This computation
sequence allows the calculation of the effects of the sets of marginals
such as 2NI(p*_1:π), 2NI(p*_2:p*_1) and 2NI(p*_3:p*_2), and their components.

(ii) The second option allows the input of a table of F(IJKL), after
and in the same manner as X(IJKL). This choice is useful in adjusting
data to fit specified marginals, a topic not discussed in this paper.

The fixed marginals will be those of X(IJKL). The table F(IJKL) is the
observed table to be adjusted to fit the marginals of X(IJKL). An
example is given in Ireland and Kullback (1968).


D. For program A only, an option is provided for the choice of sets of
marginals if these marginals are not a complete set of one-, two-, or
three-way marginals. Iterative computations for the complete sets of
marginals are always automatically performed.

E. Options are provided to specify the maximum number of cycles of
iteration for each iterative computation, and also for the
specification of the tolerance desired between the original marginals
and the computed marginals. Experience has shown that 20 cycles and
agreement to 0.01 are usually sufficient for most problems.

(3)  Outputs  and  options. 

The following notations are used in the output:

X(IJKL)                     Observed cell frequencies
Y(IJKL)                     Cell frequencies NP*_1
Z(IJKL)                     Cell frequencies NP*_2
W(IJKL)                     Cell frequencies NP*_3
U(IJKL)                     Cell frequencies corresponding
                            to specified marginals
R(I), C(J), etc.            These are equivalent to a(i),
                            b(j), etc. used in the text
RC(IJ), RD(IK), etc.        These are equivalent to a(ij),
                            b(ik), etc. used in the text
RCD(IJK), RCT(IJL), etc.    These are equivalent to a(ijk),
                            b(ijl), etc. used in the text

A. In program A, there is no option; the print-out is arranged in the
following order:


(i)  Titles  of  classification. 

(ii)  Original  tables  of  X(IJKL)  in  the  form  of  two-way  tables. 

(iii)  All  marginal  three-way,  two-way,  and  one-way  tables  and 
the  grand  total.  These  tables  are  useful  for  inspection 
if there are no higher-order interactions.

(iv) All 16 sums of quantities of the form 2 Σ X(IJKL) LN X(IJKL)
computed  in  double  precision.  These  sums  are  useful  in 
testing  certain  hypotheses  as  illustrated  in  Kullback, 
Kupperman  and  Ku  (1962). 
(v) Number of complete cycles of iterations performed for
each interaction computed, and the tolerance specified
for the marginal agreement.

(vi) Tables Y(IJKL)

2 Σ Y(IJKL) LN Y(IJKL)

First-order interaction 2NI(p:p*_1)

Chi-squared = Σ (X-Y)²/Y

Tables of residuals = X-Y

Tables of normalized residuals = (X-Y)/√Y

(vii) Print-outs under (vi) are repeated correspondingly for Z
and W, and U when specified.
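The quantities printed under (vi) can be sketched as follows for one fitted table. This is only an illustration: the function name and the 2 x 2 example are hypothetical, and the normalized residual is assumed here to be (X-Y)/√Y, whose squares add up to the chi-squared value.

```python
import math

def fit_summary(observed, fitted):
    """Residuals, normalized residuals, chi-squared and 2I(x:y) for flat lists of cells."""
    residuals = [x - y for x, y in zip(observed, fitted)]
    normalized = [(x - y) / math.sqrt(y) for x, y in zip(observed, fitted)]
    chi_squared = sum(r * r for r in normalized)
    two_i = 2.0 * sum(x * math.log(x / y) for x, y in zip(observed, fitted) if x > 0.0)
    return residuals, normalized, chi_squared, two_i

# Hypothetical 2 x 2 table (rows 20, 15 and 10, 25) with its independence fit:
# row totals 35 and 35, column totals 30 and 40, n = 70.
observed = [20.0, 15.0, 10.0, 25.0]
fitted = [35.0 * 30.0 / 70.0, 35.0 * 40.0 / 70.0, 35.0 * 30.0 / 70.0, 35.0 * 40.0 / 70.0]

residuals, normalized, chi_squared, two_i = fit_summary(observed, fitted)
print(residuals)
print([round(r, 3) for r in normalized])
print(round(chi_squared, 4), round(two_i, 4))
```

The information statistic and the chi-squared value are usually close to each other, as in Table 2.1, while the normalized residuals show which cells account for the lack of fit.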

B.  Options  available  in  print-out  of  program  B.  Print-outs  described  in 
Paragraph  A(i)-(vii)  above  for  program  A  remain  the  same,  except  for  the 
following  options: 

(1) If tables of coefficients R(I), ..., RC(IJ), ..., RCD(IJK),
..., and quantities such as 2SUM X(I...)LNR(I...), ..., are computed,
then  these  numbers  will  be  printed  out.  Both  tables  of  residuals  and 
normalized  residuals  will  be  suppressed  in  this  case. 

(2)  Options  are  provided  to  print  either  the  residuals  or  the 
normalized  residuals,  or  both  if  the  coefficients  are  not  computed  and 
printed. 

A  sample  computer  print-out  is  given  in  Appendix  A,  and  the 
setup  for  data  cards  is  given  in  Appendix  B. 
APPENDIX A


Data used in this example are taken from the Kihlberg, Narragon,
and  Campbell  study  of  the  relationship  between  car  size  and  accident 
injuries  as  quoted  by  Bhapkar  and  Koch  (1968). 

There  are  four  classifications  as  follows: 

R:  Driver  ejection  -  not  ejected  or  ejected 
C:  Accident  severity  -  not  severe  or  severe 
D:  Accident  type  -  collision  or  rollover 
T:  Car  weight  -  small  or  standard 

The  data  are  shown  in  the  2  x  2  x  2  x  2  table  at  the  beginning  of 
the  print-out. 

Before  we  discuss  our  procedure  of  analysis  on  this  set  of  data, 
we  wish  to  make  two  remarks: 

First,  Bhapkar  and  Koch  condensed  the  original  data  into  a 
2 x 2 x 2 x 2 table presumably for convenience. The original data has 4
categories in accident type, 3 in severity and 3 in car weight, and is
a 2 x 3 x 4 x 3 table for drivers who were alone at the time of accidents.
Hence,  all  conclusions  and  interpretations  given  below  are  strictly  for 
illustrative  purposes,  and  are  based  on  data  as  condensed  by  Bhapkar  and 
Koch. 

Second,  in  many  problems  of  data  analysis,  there  is  usually 
additional  knowledge  available  which  should  be  taken  into  account.  In 
this  example  for  instance,  there  is  a  time  element  linking  the  four 
classifications in the sequence: car weight -> accident type -> accident
severity -> ejection. The dependence of one classification on another can
only go from right to left and not in the reverse order. In addition,
there is the distinction of a "cause and effect" relationship between
two classifications (by logic or by law-like long past experience) or a
mere "association" relationship. That severe accidents are likely to cause
ejection of the driver is an example of the former; ejection of driver and
car weight is an example of the latter. We hope to use these additional
bits  of  knowledge  to  make  our  analysis  more  meaningful. 

Analysis of Information Table A.1 represents a preliminary scan of
these data using our procedure. Neither the third-order nor the second-order
interaction reached significance. The value for the no third-order
interaction hypothesis of .67 checks with Bhapkar and Koch's results.
Hence the six two-way marginal tables jointly contain essentially all the
statistical information available in the four-way table, or, given the six
two-way marginal tables, we could approximate the four-way table to within
sampling error. Thus, the analysis is reduced to a breakdown of the
six two-way marginal effects into individual degrees of freedom. We
shall do this in two ways for illustration.

If we compute the independence component for each 2 x 2 table, we
get the values shown in the second column in Table A.2. These computations
can be performed easily using the SUM 2X(xxxx)LN X(xxxx) values
given in the print-out. For example, the R x C independence component
is:

$$2I = 2\sum_{ij} X(IJ..)\,\ln\frac{N\,X(IJ..)}{X(I...)\,X(.J..)}$$

and is the difference of two sums.

   2 Σ X(IJ..) LN X(IJ..)  =  71559.893        2 Σ X(I...) LN X(I...)  =  77938.434
   2N LN N                 =  81960.898        2 Σ X(.J..) LN X(.J..)  =  75296.372
                             ----------                                 ----------
                             153520.791                                 153234.806

   2I = 153520.791 - 153234.806 = 285.99.

We note that the sum of these six components, 1351.62, is much larger
than the two-way marginal effects value of 1185.78.

This result shows up the danger of looking at the marginal tables
one at a time, even if there is no second-order interaction. Since the
six two-way marginal tables are interrelated, interaction in one two-way
table could conceivably affect the magnitude of interaction in a
neighboring two-way table, and thus mask the actual relationship
between these classifications.

Next we use the step-wise approach; i.e., the cumulative addition
of one two-way marginal at a time, and compute the discrimination
information value of the effect of the m-th two-way marginal over the
marginals that had been fixed up to that time. The values so computed
for the selected sequence of marginals given in column 1 are shown in
column 3 of Table A.2. The total effect checks with the value of
2nI(p*_2:p*_1) as it should.

With six two-way marginals, we have a large number (6! = 720) of ways to
order  these  marginals  in  the  sequence.  Corresponding  to  each  sequence 
we  could  compute  a  set  of  information  value  effects.  Obviously  many  of 
these  sequences  are  without  meaningful  interpretation. 

Here we shall appeal to the additional knowledge inherent in this
set of data; i.e., we shall order the marginals to be fixed in the same
order as the sequence in time, and order the "cause and effect" relationship
ahead of "association" relationship as follows:


2 x 2
Marginal tables    Association of                          Marginals fixed

DT                 accident type - car weight              (..KL), (I...), (.J..)
CD                 accident severity - accident type       (..KL), (.JK.), (I...)
RC                 ejection - accident severity            (..KL), (.JK.), (IJ..)
RD                 ejection - accident type                (..KL), (.JK.), (IJ..), (I.K.)
CT                 accident severity - car weight          (..KL), (.JK.), (IJ..), (I.K.), (.J.L)
RT                 ejection - car weight                   all two-way marginals

In  choosing  this  particular  sequence  of  ordering,  we  realize  that 
the  logic  for  the  selection  may  not  be  entirely  free  from  criticism. 
Nonetheless,  this  ordering  appears  to  be  reasonable  for  the  particular 
problem. In comparing columns 2 and 3 in Table A.2, we note that the
same conclusions will be reached for the first four effects, but exactly
opposite conclusions are evidenced by the values of the last two effects.


We give in Analysis of Information Table A.3 a detailed analysis of
the no first-order interaction component 2nI(p:p*_1) = 2I(x:x*_1), using
x*_a, ..., x*_e to denote expected cell frequencies with 1, 2, ..., 5 two-way
marginals fixed, and x*_2 to denote that of no second-order interaction.
We note that

$$x_1^*(ijkl) = x(i...)\,x(.j..)\,x(..k.)\,x(...l)/n^3,$$

and, by (2.7),

$$x_a^*(ijkl) = \frac{x(..kl)}{x_1^*(..kl)}\; x_1^*(ijkl) = x(..kl)\,x(i...)\,x(.j..)/n^2,$$

$$2I(x_a^*:x_1^*) = 2\sum_{kl} x(..kl)\,\ln\frac{n\,x(..kl)}{x(..k.)\,x(...l)}.$$

Similarly, x*_b(ijkl) and x*_c(ijkl) can also be expressed explicitly as
functions of the marginals, and the iteration process (2.7) ends at the
first cycle. x*_d and x*_e, however, cannot be so expressed and a number of
cycles of iteration are necessary to obtain the desired agreement among
the marginals.
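The explicit expressions for x*_1 and x*_a can be checked numerically. The sketch below uses hypothetical 2 x 2 x 2 x 2 counts (not the accident data, which appear only in the print-out) and confirms that x*_a reproduces the DT marginal together with the R and C one-way marginals, and that the two ways of writing 2I(x*_a:x*_1) agree.

```python
import math
from itertools import product

def margin(table, axes):
    """Collapse a table {(i, j, k, l): value} onto the index positions in `axes`."""
    out = {}
    for cell, value in table.items():
        key = tuple(cell[a] for a in axes)
        out[key] = out.get(key, 0.0) + value
    return out

# Hypothetical counts for cells (i, j, k, l) = (R, C, D, T); not data from the paper.
cells = list(product(range(2), repeat=4))
counts = [35, 12, 20, 9, 14, 22, 7, 16, 18, 11, 25, 6, 10, 30, 8, 13]
x = {cell: float(count) for cell, count in zip(cells, counts)}
n = sum(x.values())

r, c, d, t = (margin(x, (axis,)) for axis in range(4))   # one-way marginals
dt = margin(x, (2, 3))                                   # two-way marginal x(..kl)

# No first-order interaction, and the table with the DT marginal added.
x1 = {(i, j, k, l): r[(i,)] * c[(j,)] * d[(k,)] * t[(l,)] / n**3 for i, j, k, l in cells}
xa = {(i, j, k, l): dt[(k, l)] * r[(i,)] * c[(j,)] / n**2 for i, j, k, l in cells}

# Effect of the DT marginal, computed from the full tables and from the marginals alone.
effect_from_tables = 2.0 * sum(xa[cell] * math.log(xa[cell] / x1[cell]) for cell in cells)
effect_from_margins = 2.0 * sum(
    dt[(k, l)] * math.log(n * dt[(k, l)] / (d[(k,)] * t[(l,)])) for (k, l) in dt
)
print(round(effect_from_tables, 6), round(effect_from_margins, 6))
```

The second form is simply the independence component of the 2 x 2 marginal table of D by T, which is why the first entries in columns 2 and 3 of Table A.2 coincide.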

One of the useful features in these programs is that the residuals
and the normalized residuals are printed out for examination. Column 1
of Table A.4 shows the normalized residuals after all two-way marginals
have been fixed. All these residuals are small in magnitude. The largest
two are R(2121) = 1.084 and R(2122) = -1.819. The addition of the three-way
marginals x(.jkl), x(ijk.) and x(i.kl) in that sequence did not change
the residuals by much. The addition of the three-way marginal x(ij.l),
however, improves the overall picture of these residuals. The information
value of 2.928 with 1 d.f. suggests that association between ejection -
accident severity - car weight may merit further investigation.

There are many ways to construct analysis of information tables for
a four-way table, the choice of which depends mainly on the purposes of
the experiment and the hypotheses of interest. In Analysis of Information
Table A.5 we give an example for the analysis involving the following
hypotheses:

H_1: Given accident type and car weight, ejection is independent of
accident severity, or R x C | DT. The appropriate marginals to be
considered fixed here are x(i.kl) and x(.jkl). Let the expected
frequencies under this hypothesis be denoted as x*. It can be
verified that

$$x^*(ijkl) = \frac{x(i.kl)\,x(.jkl)}{x(..kl)},$$

and can be computed directly.

Since $\sum_{kl} x^*(ijkl) \neq x(ij..)$, the effect of the marginal
restraint x(ij..) has not been taken into account in H_1. Addition
of x(ij..) to x(i.kl) and x(.jkl) as restraints yields x* as
expected cell frequencies of independence of R and D classifications
given the three marginals. The statistic 2I(x*:x*) measures
the conditional independence between R and C classifications as in
H_1, but with the added restraint x(ij..), i.e., the sum of the
expected cell frequencies over the last two classifications must
satisfy the observed frequencies. Hence, we have
H_2: Given accident type, car weight and the observed ejection-severity
frequencies, ejection is independent of accident severity. The
difference between the two components represents

H_3: The association between ejection and accident severity is independent
of accident type and car weight.

We note that x* cannot be computed directly since x(i.kl), x(.jkl)
and x(ij..) imply all six two-way marginals. In Analysis of Information
Table A.6 we show that 2I(x:x*) is in fact a component of three-way
marginal effect.

Analysis of Information Table A.1

Traffic accidents data, from Bhapkar and Koch (1968)

Components due to                   Information          Value     D.F.

No first-order interaction          2nI(p:p*_1)         1193.11     11
Effect: all two-way marginals       2nI(p*_2:p*_1)      1185.78      6
No second-order interaction         2nI(p:p*_2)            7.33      5
Effect: all three-way marginals     2nI(p*_3:p*_2)         6.66      4
No third-order interaction          2nI(p:p*_3)             .67      1

Table A.2

Analysis of Effect of All Two-Way Marginals 2nI(p*_2:p*_1)

                                          Information value,
                                          marginals fixed
Two-way             Independence in       cumulatively in the
marginal tables     each 2 x 2 table      sequence given at left     D.F.

DT                       52.96                   52.96                 1
CD                      601.42                  601.42                 1
RC                      285.99                  286.00                 1
RD                      401.69                  229.33                 1
CT                         .76                   14.38                 1
RT                        8.80                    1.69                 1

                       1351.62                 1185.78                 6
Analysis of Information Table A.3

First-order interaction 2I(x:x*) = 1193.11

      Two-way marginals given    Component      Information            D.F.

a)    DT                         effect         2I(x*:x*) =   52.95      1
                                 interaction    2I(x:x*)  = 1140.15     10

b)    DT,CD                      effect         2I(x*:x*) =  601.42      1
                                 interaction    2I(x:x*)  =  538.73      9

c)    DT,CD,RC                   effect         2I(x*:x*) =  286.00      1
                                 interaction    2I(x:x*)  =  252.73      8

d)    DT,CD,RC,RD                effect         2I(x*:x*) =  229.33      1
                                 interaction    2I(x:x*)  =   23.40      7

e)    DT,CD,RC,RD,CT             effect         2I(x*:x*) =   14.38      1
                                 interaction    2I(x:x*)  =    9.02      6

f)    DT,CD,RC,RD,CT,RT          effect         2I(x*:x*) =    1.69      1
                                 interaction    2I(x:x*)  =    7.33      5
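The first-order and second-order interaction components in Tables A.1 and A.3 can be checked with a short iterative scaling routine. The following is a minimal sketch, not the KKV68A program itself: it assumes the usual iterative proportional fitting scheme started from a uniform table, uses the observed counts from the print-out in this appendix, and the function names `margin`, `fit_marginals` and `information` are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations

# Observed counts x(ijkl); keys are (i, j, k, l) for R, C, D, T.
X = {
    (1, 1, 1, 1): 350, (1, 2, 1, 1): 150, (2, 1, 1, 1): 26, (2, 2, 1, 1): 23,
    (1, 1, 2, 1): 60, (1, 2, 2, 1): 112, (2, 1, 2, 1): 19, (2, 2, 2, 1): 80,
    (1, 1, 1, 2): 1878, (1, 2, 1, 2): 1022, (2, 1, 1, 2): 111, (2, 2, 1, 2): 161,
    (1, 1, 2, 2): 148, (1, 2, 2, 2): 404, (2, 1, 2, 2): 22, (2, 2, 2, 2): 265,
}


def margin(table, axes):
    """Sum a table over every index position not listed in `axes`."""
    out = defaultdict(float)
    for cell, count in table.items():
        out[tuple(cell[a] for a in axes)] += count
    return out


def fit_marginals(x, margin_sets, cycles=50):
    """Iterative proportional fitting of the listed marginals of x."""
    n = sum(x.values())
    y = {cell: n / len(x) for cell in x}  # uniform starting table
    for _ in range(cycles):
        for axes in margin_sets:
            target = margin(x, axes)
            current = margin(y, axes)
            for cell in y:
                key = tuple(cell[a] for a in axes)
                y[cell] *= target[key] / current[key]
    return y


def information(x, y):
    """Minimum discrimination information statistic 2I(x:y)."""
    return 2.0 * sum(x[c] * math.log(x[c] / y[c]) for c in x)


one_way = [(a,) for a in range(4)]
two_way = list(combinations(range(4), 2))

first_order = information(X, fit_marginals(X, one_way))
second_order = information(X, fit_marginals(X, two_way))
two_way_effect = first_order - second_order
```

Under these assumptions `first_order` and `second_order` should agree with the tabled values 1193.11 and 7.33, and their difference with the two-way marginal effect 1185.78.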
Table A.4

Normalized Residuals

            All two-way         Add x(.jkl),          All three-way
            marginals           x(ijk.), x(i.kl)      marginals

R(IJ11)     -.482    .825       -.126    .426          .086   -.131
             .473   -.576        .487   -.470         -.306    .348

R(IJ21)      .351   -.306       -.551    .426         -.205    .153
            1.084   -.387       1.158   -.481          .386   -.178

R(IJ12)      .068   -.112        .055   -.074         -.037    .050
             .374   -.252       -.222    .188          .154   -.126

R(IJ22)      .293   -.143        .369   -.219          .133   -.080
           -1.819    .608       -.860    .274         -.331    .099

Components due to               Information            D.F.

Second-order interaction        2I(x:x*) = 7.328         5
CDT, RCD, RDT effect                       3.733         3
RCT effect                                 2.928         1
Third-order interaction                     .667         1
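The normalized residuals in Table A.4 appear to be the ordinary residuals divided by the square root of the fitted value; the print-out values are consistent with that reading. A small sketch under that assumption, using the observed X(IJ11) cells and the fitted Y(IJ11) cells for the all-two-way-marginals fit (the function name `normalized_residuals` is hypothetical):

```python
import math


def normalized_residuals(observed, fitted):
    """Return (x - y) / sqrt(y) for each pair of observed and fitted cells."""
    return [(x - y) / math.sqrt(y) for x, y in zip(observed, fitted)]


# X(IJ11) and Y(IJ11), read row by row from the print-out.
x_ij11 = [350, 150, 26, 23]
y_ij11 = [359.131, 140.235, 23.698, 25.935]

r_ij11 = normalized_residuals(x_ij11, y_ij11)
```

Summed over all sixteen cells, the squares of these quantities give the Pearson chi-squared value that the program prints beside each interaction component.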
Analysis of Information Table A.5

Components due to               Information             D.F.

                                2I(x*:x*) = 4.859         2
Third-order interaction         2I(x:x*)  =  .669         1
PRINT-OUT OF PROGRAM KKV68A

R   EJECTED          I=1  NO            I=2  YES
C   SEVERITY         J=1  NOT SEVERE    J=2  SEVERE
D   ACCIDENT TYPE    K=1  COLLISION     K=2  ROLLOVER
T   CAR WEIGHT       L=1  SMALL         L=2  STANDARD

ORIGINAL TABLES

X(IJ11)
     350     150
      26      23

X(IJ21)
      60     112
      19      80

X(IJ12)
    1878    1022
     111     161

X(IJ22)
     148     404
      22     265

MARGINAL TABLES

THREE-WAY TABLES

X(IJ1.)
    2228    1172
     137     184

X(IJ2.)
     208     516
      41     345

X(.J1L)
     376    1989
     173    1183

X(.J2L)
      79     170
     192     669

X(1J.L)
     410    2026
     262    1426

X(2J.L)
      45     133
     103     426

X(I.1L)
     500    2900
      49     272

X(I.2L)
     172     552
      99     287

TWO-WAY TABLES

X(IJ..)
    2436    1688
     178     529

X(I.K.)
    3400     724
     321     386

X(I..L)
     672    3452
     148     559

X(.JK.)
    2365     249
    1356     861

X(.J.L)
     455    2159
     365    1852

X(..KL)
     549    3172
     271     839

ONE-WAY TABLES

X(I...)
    4124     707

X(.J..)
    2614    2217

X(..K.)
    3721    1110

X(...L)
     820    4011

TOTAL

X(....)
    4831

print  of  sums 

SUV,  2X(IJKL)LNX(IJKL)=  ,  62B')BS0425756393+  005 

SUM  2  X ( I J< . ) LNX ( I JX • ) -  , 671 B4  604  6730 9B 36+0 OS 

SUV  2X(.  JKL)LNX(.J<L>S  . 663SBC&SB19576S1+00B 

SUV  2  X  (  I  J.  L)  LNX(  1  J.L  ) -■  ,  671 7IU40  4  SB529GG+00  6 

SUV  2X(T.KL)^NX(I  ,<L)=  .  6B7BB  13?.U''Ji>4094  +  00'j 
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V  I  J.  .  /  LW*  (  1  J.  ,  )  - 


.71 5590931 3579511  +  00!} 


BUM  2X ( I  •  K  .  ) LMX (!•<•)= 


.  731323604J055975+OG5 


BUM  2X(I..L>LUX<I..L)= 

BJM  2X<  .  JK.  >LNX(  .  J .  )  - 
BJM  2x( . J.LJLNXt .J.L)= 
SUM  2X(..KL)LNX(..KL)s 
BUM  2X( I.,,) LMX (!..,)= 
SUM  2X ( . U. . ) LMX ( • U. . )  - 

SUM  2X(..K.)LMX(..K.)= 
SUM  2X( . . ,L) LNX(. , .1) = 


•  7354647645009640 <005 
.7069003539101703+006 
.7089636610024700+005 
,7240536400014229+005 
.7793B4 2367626649+005 
. 75296371 55223355+005 
.7675314254771572+005 
.7756015555323097+005 


2M  LM  M  = 


.6146009520312467+005 
NO OF ITERATIONS=  1 CYCLES

AGREEMENT BETWEEN MARGINALS TO .100-02

Y(I,J,K,L)

Y(IJ11)
   291.734    247.427
    50.014     42.418

Y(IJ21)
    87.026     73.809
    14.919     12.654

Y(IJ12)
  1427.005   1210.279
   244.639    207.485

Y(IJ22)
   425.685    361.035
    72.978     61.894

2 Y  UN  Y=  .6166539046007271+006 

FIRST-ORDER INTERACTION=  .1193105777+004

CHI-SQUARED=  .1601729312+004
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■VJ  Or  IT£K/\TlO.Mr,= 


8  CYCLES 


AGREEMENT  BETWEEN  MARGINALS  TO  ,1.00-02 

r(i»  jirfiD 

Y( I Jll ) 

369.131  140.235 

23.695  25.93b 

Y ( I J21 ) 

57.34b  lib. 289 
14.825  83.540 

Y(IJ12> 

1875.040  1025.594 

107.132  164.235 

Y  ( I J  2  2 ) 

144.404  406.852 

32.345  255.289 

TABLE  OF  RESIDUALS 

rt  < I JJ 1) 

-9.131  9.765 

2.302  -2.935  171 


R<IJ2l) 


*> :  b  -3  -  Oun 

4.174  -3.640 

RC 1 J12) 

2 . 960  “3 • 694 
3.B68  -3.235 

R ( I J22) 

3 « 5lb  — 2.862 

-10.345  9.711 

TABLE OF NORMALIZED RESIDUALS

R(IJ11)
    -.482      .825
     .473     -.576

R(IJ21)
     .351     -.306
    1.084     -.387

R(IJ12)
     .068     -.112
     .374     -.252

R(IJ22)
     .293     -.143
   -1.819      .608

2 L  L M  2=  .6295117708634006+005 

SECOND-ORDER INTERACTION=  .7327171723+001

CHI-SQUARED=  .7014267537+001

NO OF ITERATIONS=  7 CYCLES

AGREEMENT BETWEEN MARGINALS TO .100-02

Y(I,J,K,L)

Y(IJ11)
   348.393    151.607
    27.607     21.393

Y(IJ21)
    61.608    110.392
    17.392     81.608

Y(IJ12)
  1879.608   1020.392
   109.392    162.608

Y(IJ22)
   146.392    405.608
    23.608    263.392

TABLE OF NORMALIZED RESIDUALS

R(IJ11)
     .086     -.131
    -.306      .348

R(IJ21)
    -.205      .153
     .386     -.178

R(IJ12)
    -.037      .050
     .154     -.126

R(IJ22)
     .133     -.080
    -.331      .099


LN  W=  * 62057034 64802709+005 

THIRD-ORDER  INTERACTIONS  .66961195360  +  000 


Crll-bQJARE’Js 


.‘>714  044600+000 


1  74 


u. . 


SPECIFIED  -ARGI^ALS  ,,<L  ,JK. 

NO  nr  ITERATIONS*  6  CYCLES 

AGREEMENT  -ETWEEN  MARGINALS  TO  ,  11)0-02 

Y(I  ,  J  |  K  |  L  ) 

Y  (I  J1  I  ) 

32«,621  172.019 

1 9 , 3 1 1  20.047 

Y  (  I  J  2  1  ) 

19,293  I  27, 4, .7  . 

11,  499  02, 741 

Y  <  1  4  1  2  1 

1904,476  ♦•993  , AOS  ‘  * .  . — . . 

111,599  1 42,050 

Y<  I  J221 

I  5  2  ,  A  n  9  3  9  4  ,  A  3  0 

35.599  2 5 A , 1  a  | 

2IU4)  L  N  (H4)i,  ,628 3 51049937586 4 ♦CCS 

HT  0  4  ACT|0  S  (  ■  •  4  )  *  .  2  3  3  9  9  2  6  3  (UU  0  0  2 

'CM  I  •  S  ;}  IJ  A  I'  (..  ■  b  ■  - - ,  2.3523  1^50  9*0(12 
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SPLClFirp  ■-  A  »  G  I  ? !  A  L  s  .  «  <  L  ,JK.  JJ,, 

no  np  JTf.rf«T  lO'-S.  f>  CYCLES 

A  c,  3  F.  fi  *  E  N  T  'Elf  f'FM  M  A  9  t,  J  N  A  L  5  TO  ,100-02 

Y  1  1  ,  J  i  *  ,  L  ) 

Y  (  I  J  U  1 

361,  1  37  l  'O  ,  2  n  a 
21,163  23,363 

Y  <  l  J2  1  ) 

•  ■  i  ’ 

50,909  120.2/6 

13,742  7fl,073  ’ 

vu!m> 

1072.911  1072,615 

1  U  9 , 7  4  0  1  6  6  .  7  .1  5 

Y  (  1  J  ?  2  1 

14  2,994  4  0  ]  ,  9  ?  J 

33,356  260.379 

2  <  l 1 5  )  L  N  ("',}■  ,  6  2  0  (,  9  4  9  3  7  3  9 p.  5 3  3 1  +  HO  5 

1  H  T  r.  w  A  C  1  J  0  "9  I '  1 5  )  a  ,  902CJ5  10*1  in+nu  l 

.  n  7  6  7  2  7  ?  1  5  '■  4  (1  n  1 
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CH  I  -  Spo  A  !>r  "  s 


Si'L1  1 »  1  ;m  <t_.  1*;ALS  , JKL  IJrt.  J.KL 

ti  \J  .L'h  .  1ILi\ATIo  ,5.i-  4  CYCLES. 

A LAEnT.  ^ET'.vEljJ  .SARCiiri/iLS  TO  .  ,100-02 

Y  ( I  r  J  >  K  f  L ) 

Y(IJli)  .  . . 

*  3d:.!  147.032 

23.uj2  2S.3l>P 

Y (IJ213  ... 

64.  '120  107.5)40 

.....14,660- . CJ4...420  _ 

.YILln) 

1 <>76 . 632  1024.306 

116,306  ..  1 9 0 •  .o i 2 

. .YJ1J221 . . . . . 

M  3,461)  4  06.42  0 

<-.0,421.  200.079. 
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1  m  ■!  L  (>i  ij('  -?.-*AL  .1  I  >'  z.*j  I  l.IJAI.S 


SPECIFIED  MARGINALS  I «KU  • JKL  !,)♦♦ 

NO  or  ITERATjOnSb  4  CYCLES 

AGREEMENT  between  MARGINALS  TO  *100-02 

YU'JtK'L) 

Y  (  1  J  1 1  ) 

353.346  144,639 

22*634  26,366 

u  U2i  r 

62,331  109,669  "  ~  - - - 

16,669  62.931 

Yl  I JI2) 

1881,006  1018,914 
107,4|4  |64, O07 

Y  (  I J22  1 

139,217  412,763  - 

30,783  256,2(7 

2(01)  LN  (U1  )  ■  ,6285297612235454*005 

INTERACTIONS  (u I ) ■  ,5528135209*001 

CHI-SQUAREO- . . ,5276909156*001 
APPENDIX B

Setup for Data Cards for KKV68A and KKV68B

All data values must be right adjusted in the fields specified.

(1) First card has the value of N punched in columns 1-5.

        N = 2 for the two-way table
        N = 3 for the three-way table
        N = 4 for the four-way table

(2) The next set of data consists of N descriptive title cards. The
    information may be punched in columns 1-72.

(3) The next card consists of data in the following columns:

    Col. 1-5     M1 = number of categories in the row classification
    Col. 6-10    M2 = number of categories in the column classification
    Col. 11-15   M3 = number of categories in the depth classification
    Col. 16-20   M4 = number of categories in the level classification
    Col. 21-25   NSETS = number of specified sets of marginals, ≤ 5.
                 For program KKV68B, NSETS = 0.
    Col. 26-30   ITMAX = maximum cycles of iterations
    Col. 31-50   CONST = tolerance required between marginals
    Col. 51-55   IFCR = 1, compute F(IJKL) = n/rcdt
                 IFCR = 2, input F(IJKL)
    Col. 56-60   IRCD = 1, coefficients not calculated
                 IRCD = 2, coefficients calculated
    Col. 61-65   IPRINT = 1, print residuals
                 IPRINT = 2, print normalized residuals
                 IPRINT = 3, print both

    If IRCD = 2, IPRINT has no effect.

    For program KKV68A, cols. 51-65 are not used.

(4) The next group of data cards consists of the specified marginals if
    NSETS ≠ 0 in program KKV68A.

    The first card contains NMARGS in col. 1-5. NMARGS ≤ 6.

    The second card contains MARGN in 6A4 format.

    MARGN is specified as, e.g., IJ..

(5) The next set of cards is the input X(IJKL) specified using format
    7D10.0.

(6) If IFCR = 2 in program KKV68B, the next set of cards are the F(IJKL)
    values in 7D10.0 format. The cards for (5) and (6) are punched by
    column within each row, row x column within each depth, and row x
    column by depth within each level.

Repeat from (1) through (6) for each set of data to be analyzed.
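The column layout of the parameter card in item (3) can be illustrated with a short fixed-column reader. This is a hypothetical sketch for checking a card image before a run, not part of KKV68A or KKV68B: the function name `parse_parameter_card` is invented here, blank integer fields are assumed to read as zero, and a Fortran-style `D` exponent in CONST is assumed to be acceptable.

```python
def parse_parameter_card(card):
    """Split an 80-column card image into the fields listed in item (3)."""
    card = card.ljust(80)

    def int_field(first_col, last_col):
        # Columns are 1-based and inclusive, as in the card description.
        text = card[first_col - 1:last_col].strip()
        return int(text) if text else 0  # assumption: blank reads as zero

    const_text = card[30:50].strip().upper().replace("D", "E")
    return {
        "M1": int_field(1, 5),
        "M2": int_field(6, 10),
        "M3": int_field(11, 15),
        "M4": int_field(16, 20),
        "NSETS": int_field(21, 25),
        "ITMAX": int_field(26, 30),
        "CONST": float(const_text) if const_text else 0.0,
        "IFCR": int_field(51, 55),
        "IRCD": int_field(56, 60),
        "IPRINT": int_field(61, 65),
    }


# A made-up card for a 2x2x2x2 table, one set of marginals, 20 cycles.
example_card = "".join(f"{v:>5}" for v in (2, 2, 2, 2, 1, 20)) + f"{'.100D-02':>20}"
fields = parse_parameter_card(example_card)
```

For a KKV68A run the last three fields are simply left blank, which this sketch reports as zeros.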
NEYMAN AWARDED THE 1968 SAMUEL S. WILKS MEMORIAL MEDAL

The Recipient of the Fourth Samuel S. Wilks Award
Announced by Frank E. Grubbs


Professor Jerzy Neyman of the University of California, Berkeley,
has been awarded the Samuel S. Wilks Memorial Medal for 1968. The
announcement of Professor Neyman's selection for the 1968 Wilks Award
was one of the highlights of the Fourteenth Annual Conference on the
Design of Experiments in Army Research, Development, and Testing,
which was held at the Army Chemical Center, Maryland, 23-25 October
1968. Professor Neyman has long been recognized as one of the
foremost statisticians in the entire world, having made many fundamental
contributions to the theory and application of statistical methodology.
The citation for Professor Neyman reads as follows:

"To  Professor  Jerzy  Neyman,  whose  extensive  contributions  both  to 
the  theory  and  practice  of  statistics  have  led  to  fundamental  changes 
in  the  thinking  and  methodology  of  scientists  all  over  the  world.  He 
has  inspired  and  led  more  than  a  generation  of  students  and  his  continued 
leadership  is  effective  today.  Both  by  precept  and  by  example,  he  is  one 
of  the  foremost  statisticians  in  the  entire  world." 

The  Samuel  S.  Wilks  Memorial  Medal  Award  is  administered  by  the 
American  Statistical  Association,  a  non-profit,  educational  and 
scientific society founded in 1839. The Wilks Award is given each
year  to  a  statistician  and  is  based  primarily  on  his  contribution  to 
the  advancement  of  scientific  or  technical  knowledge  in  Army  statistics, 
ingenious  application  of  such  knowledge,  or  successful  activity  in  the 
fostering  of  cooperative  scientific  matters  which  coincidentally  benefit 
the  Army,  the  Department  of  Defense,  and  the  Government. 

The  Award  consists  of  a  medal,  with  a  profile  of  Professor  Wilks 
and the name of the Award on one side, and the seal of the American
Statistical  Association  and  name  of  the  recipient  on  the  reverse;  a 
citation,  and  an  honorarium  related  to  the  magnitude  of  the  Award  funds. 
The  Annual  Design  of  Experiments  Conferences,  at  which  the  Award  is 
given  each  year,  are  sponsored  by  the  Army  Mathematics  Steering  Com¬ 
mittee  on  behalf  of  the  Office  of  the  Chief  of  Research  and  Development, 
Department  of  the  Army. 

The  funds  for  the  Wilks  Memorial  Award  were  donated  by  Philip  G. 
Rust, Thomasville, Georgia.

With  the  approval  of  President  Geoffrey  Moore  of  the  American 
Statistical  Association,  the  Wilks  Award  Committee  for  1968  consisted 
of  the  following: 
Professor Robert E. Bechhofer      Cornell University
Professor William G. Cochran
Dr. Francis G. Dressel             Duke University and the
                                   Army Research Office-Durham
Dr. Churchill Eisenhart            National Bureau of Standards
Professor Oscar Kempthorne         Iowa State University
Dr. Alexander M. Mood              University of California
Major General Leslie E. Simon      Retired
Dr. Frank E. Grubbs, Chairman      Aberdeen Research and
                                   Development Center


Professor Jerzy Neyman was born in Bendery, Bessarabia, of Polish
parents. He was educated in Russia at the University of Kharkov and
when Poland again became an independent state, he went to Warsaw where
he received the Ph.D. degree from the University of Warsaw. He held
several positions in Poland: he was a Lecturer at the University of
Warsaw and the University of Cracow and was Head of the Biometric
Laboratory of the Nencki Institute in Warsaw. Dr. Neyman received a
Rockefeller Fellowship which allowed him to study at the University of
Paris and at University College, London. In 1934, he became a member of
the staff at University College, remaining there until 1938 when he went
to the University of California, Berkeley, as Professor of Mathematics.
He has remained at Berkeley for 30 years as the Director of the
Statistical Laboratory and Professor first of Mathematics and then, in
1955, when the Statistics Department was established, as Professor of
Statistics. Professor Neyman has been a Visiting Lecturer at many
universities in the United States and abroad. He is now Professor
Emeritus recalled to active duty and Director of the Statistical
Laboratory.

Professor Neyman has received many awards and honors including
honorary degrees from the University of Chicago, the University of
California, and the University of Stockholm. He has also received the
Guy Medal in Gold of the Royal Statistical Society (London, England),
the Newcomb Cleveland Award of the American Association for the
Advancement of Science and the Centennial Award of the Academic Senate
of the University of California at Berkeley. In 1963, he was elected to
the National Academy of Sciences, U.S.A., and as a foreign member of the
Royal Swedish Academy. In 1966, he was elected to foreign membership
in the National Academy of Science of Poland. He was elected an honorary
member of the International Statistical Institute, a Fellow of the
American Statistical Association, of which he was Vice President
1947-48, and a Fellow of the Institute of Mathematical Statistics of
which he was President in 1949. Dr. Neyman is a Fellow of several other
societies including the Econometric Society, the Biometric Society, the
Mathematical Society of France, and the Polish Mathematical Society. He
is a member of several other mathematical societies and of several
astronomical societies including the International Astronomical Union.

He is President-elect of the International Association for Statistics
in Physical Sciences.
Professor Neyman's research can be divided into three parts. First,
he worked in pure mathematics, but then, beginning in the late 1920's,
he turned to the theory of statistics. He developed, jointly with E. S.
Pearson, the theory of testing hypotheses and also developed the theory
of confidence intervals. Even while Dr. Neyman was in Poland and in
England, he was concerned with the application of statistics. After he
came to the United States, his interests turned more towards applications
of statistics, especially sampling theory and applications in the various
sciences including astronomy, biology, and health and weather modification.
His principal recent theoretical work has been the development of the
C(α) test for testing composite hypotheses.

During  World  War  II,  Professor  Neyman  and  the  Berkeley  Statistical 
Laboratory  worked  on  tactical  problems  of  the  Air  Force  under  the  National 
Defense Research Committee. He has served on many committees concerned
with  statistics  in  branches  of  the  Government,  in  scholarly  societies  and 
in  education. 

Professor Neyman is the author of Lectures and Conferences and of
First Course in Probability and Statistics. He is the editor of the
Bernoulli-Bayes-Laplace Jubilee Volume and of the Proceedings of the
Berkeley Symposia, which now amount to 17 volumes running to more than
7,600 pages. In addition, he is the author or co-author of more than
200 scientific papers in scholarly journals. As noted above, his
publications form the very basis of the modern testing of hypotheses and
interval estimation. Indeed, they are now regarded as classical and
the earlier papers have been republished jointly by the Cambridge
University Press and the University of California Press in two volumes:
one contains the papers joint with E. S. Pearson; the other contains the
remaining important papers published before 1945. Several of his books
and papers have been translated into Spanish, Polish, and Russian.

It is probably correct to state that Professor J. Neyman is one of
the most outstanding statisticians in the world today, due not only to
his extremely important basic contributions, but also to his great
activity in using the fundamental concepts in many fields of application
and in constructing stochastic models for such diverse and important
phenomena as a two-stage theory of carcinogenesis and the distribution
of galaxies in space.

Professor Neyman has many students and by now grand-students and
great-grand-students all over the world. Almost all of his students in
Poland were killed during the Nazi invasion. However, since he attracts
students to Berkeley from every country, there is by now a new generation
of Polish students of Professor Neyman's. Today, his doctoral students
are working in theoretical statistics, in problems arising in the design
of weather modification experiments, in carcinogenesis, in the transfer
of memory, and in several intricate problems in cell biology.
Professor Neyman is admired by his colleagues and his students for his
efforts in creating a stronger science, a more meaningful education, and
a better world in which to live. In addition to the high esteem of his
colleagues and students, Neyman enjoys their affection. His
distinguishing characteristics are his intellectual inspiration and
dedication.


PROBLEMS  IN  EVALUATING  TREATMENT  RESPONSE 
OVER  UNEQUALLY  SPACED  TIME  INTERVALS 

Gerhard J. Isaac

U.S.  Army  Medical  Research  and  Nutrition  Laboratory 
Denver,  Colorado 

ABSTRACT. Test subjects are first conditioned and then various
physiological parameters are measured at a sea level location to provide
base-line values. After moving to high altitude the measurements are
repeated several times at unequally spaced time intervals. Final
measurements are made upon return to sea level. Interest exists in such
findings as the initial impact of high altitude exposure, possible
adjustment to altitude and effect of return to sea level. What statistical
analysis will provide the most appropriate basis for inferences about
the questions of interest? Analyses considered include analysis of
variance and paired t tests against control.

The problem I am going to outline for you became of special interest to
me in connection with certain high altitude studies carried out by our
Laboratory. Basically these studies involved the moving of test subjects
from sea level to high altitude (14,100 ft.) and back to sea level. A
primary interest was in the effect of altitude on performance. Also of
interest were possible explanations of the physiological basis for changes
in performance and ways of ameliorating the effects of abrupt movement
to high altitude.

Parameters  selected  for  measurement  include  those  which  prior  studies, 
or  knowledge  of  physiological  processes,  suggest  may  be  responsive  to 
changes  in  altitude.  Initial  measurements  made  at  sea  level,  after  a 
period of training, provide the control or base-line values for each subject.
The  effects  of  altitude  are  reflected  in  subsequent  measurements  on  selected 
days  at  altitude.  This  might  follow  a  pattern  like  days  1,  3,  7  and  14 
after  arrival  at  altitude.  Final  measurements  are  made  upon  return  to  sea 
level. 

An  appropriate  statistical  analysis  is  desired  to  provide  a  basis  for 
answers  to  a  series  of  questions  about  the  parameters  measured.  Inference 
drawn  will  reflect  not  only  comparisons  among  the  findings,  but  also  will 
deal  with  the  physiological  aspects  of  the  parameters.  In  a  given  case 
the fact of significant change may be more important than the direction
of  change,  though  direction  also  may  be  of  concern. 

It  may  be  useful  to  list  some  of  the  questions  that  arise  in  a  study 
of  this  kind. 

1.  What is the initial impact of a move to high altitude?

a)  Is  the  parameter  significantly  modified  in  any  way  by  the  change 
in  environment? 
2.  What is the effect of remaining at high altitude?

a)  If the initial impact is a modification of the control values,
do they tend to subsequently return to normal, do they tend
to modify further or do they remain about as initially modified?

b)  If  there  is  no  significant  initial  impact  at  altitude,  is  there 

a  tendency  for  values  to  change  gradually  as  exposure  to  altitude 
continues?  Is  there  reason  to  believe  there  is  a  training  effect, 
an  adjustment  to  altitude,  or  that  some  other  factor  is  operating? 

3.  What  is  the  effect  of  returning  to  sea  level  after  a  period  at  altitude? 

a)  Is  there  an  immediate  return  to  control  levels? 

b)  Is  there  a  delayed  return  to  control  levels? 

c)  Do  values  find  a  level  different  from  both  control  and  altitude 
levels,  as  in  the  case  of  a  continuing  training  effect? 

The obvious overall problem is that of selecting statistical procedures
that are both valid and appropriate for testing the various hypotheses
implied in the series of questions posed. Also to be considered is
which statistical procedures will contribute the most toward extracting
the maximum amount of useful information from the data.

Let us consider first a statistical evaluation that starts with an
analysis of variance. This permits inference regarding the presence of
significant differences among the means for the measurement days. At this
point, however, there is no direct information as to which differences
between days are significant. For example, we have to look further for
information about the significance of the initial impact of moving to
altitude as reflected in the control measurement and the first one at
altitude. Furthermore, the size of the difference between these measurements,
or any other pair, may reflect both treatment effect and the number of days
elapsed between measurements.

At  this  point  I  became  disturbed  at  the  implications  of  using  an  overall 
anova  as  the  basis  for  some  critical  difference  which  would  be  used  to  test 
for  significant  differences  between  various  means.  The  difficulties  seemed 
to  be  much  the  same  whether  I  thought  of  unusual  variability  among  the 
subjects at this time because of accidents of selection, variability in
the  state  of  conditioning  or  adjustment  to  the  test  procedures.  Of  course, 
especially  when  numbers  are  small,  the  usual  observations  about  the  paired 
t  procedure  are  in  order.  A  minus  factor  is  the  loss  of  degrees 
of  freedom,  and  a  plus  factor  is  the  incorporation  into  the  calculation 
of  the  correlation  between  the  two  sets  of  response  data.  Thus  this 
analysis  places  a  premium  upon  consistency  of  direction  and  extent  of 
change  among  the  subjects.  As  we  well  know,  if  all  subjects  tend  to  move 
in  the  same  direction  and  in  about  the  same  proportion,  even  a  small  relative 
change may show up as highly significant. But such a comparison utilizes only
a  portion  of  the  data  in  determining  the  error  component,  whereas  anova 
utilizes  the  entire  set  of  data. 
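
The paired t calculation discussed above can be sketched as follows. The measurements are invented for illustration only (they are not data from the altitude studies), and the function name `paired_t` is hypothetical; the sketch simply forms each subject's difference between control and first altitude day and tests the mean difference against zero.

```python
import math


def paired_t(control, treated):
    """Return (t, degrees of freedom) for paired measurements."""
    diffs = [b - a for a, b in zip(control, treated)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # The variance of the differences already reflects the correlation
    # between the two sets of responses on the same subjects.
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t = mean_diff / math.sqrt(var_diff / n)
    return t, n - 1


# Hypothetical readings for four subjects: sea level control, then day 1
# at altitude.
control = [10.0, 12.0, 11.0, 13.0]
altitude_day1 = [12.0, 15.0, 13.0, 16.0]

t_value, dof = paired_t(control, altitude_day1)
```

Because every subject in this made-up example moves in the same direction by a similar amount, the differences have a small variance and t is large even with only three degrees of freedom, which is the consistency premium described in the text.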


The main argument for the paired t test seemed to lie in the
directness of the inferences that could be drawn. The test between
control and initial altitude values would provide an answer to the
question about possible significant changes due to the initial impact
of  altitude.  Comparisons  with  subsequent  days  would  reveal  if  significant 
changes  persisted  and  for  how  long.  Or,  in  the  case  of  a  delayed  reaction, 
when  a  significant  difference  developed.  Comparisons  between  control  and 
final  sea  level  values  would  reveal  the  extent  to  which  there  was,  or 
was  not,  a  return  to  original  levels.  This  would  reflect  possible  training 
effects  or  carryover  effects  related  to  the  stay  at  altitude.  Similar 
paired  t  tests  made  between  altitude  days  would  throw  light  on  the  effect 
of  sustained  living  at  altitude.  Or  comparisons  could  be  made  between 
final  altitude  and  final  sea  level  measurements. 

It would seem that in the paired t approach the emphasis is on
changes in the levels of the measurements under the various test
conditions and there is a minimum concern over the length of time
intervals between measurements.

An extension of the problem occurs when the test subjects are subdivided
into treatment groups. A common procedure is to put all subjects through
a conditioning program at sea level before making control measurements. When
the subjects are moved to altitude one group may be fed a diet, or a drug,
that it is hoped will mitigate some of the undesirable responses to altitude
exposure. If the randomization process used to make assignments to the
treatment groups is successful, the control values of the groups will be
in close agreement. By the same token, if the treatment is successful,
there will be divergence at altitude, and possibly a return to
agreement when again measured at sea level.

A two-way anova can be performed but questions of logic arise because
of the patterns in the responses. Often, the response curve is essentially
parabolic in form and is anchored at control and final sea level values.
Only the "middle" is really subject to treatment response. It would seem
that this would lead to understating the average difference between groups
because the "end" values, by design, have minimal variation, whereas treatment
response, if present, is concentrated in the "middle" values.

It would appear that there are several options for approaching this
problem. If there are only two treatments, it would seem appropriate to
first run a t test of differences between treatments at control and at
return to sea level. Non-significant differences at final sea level might be
said to confirm this. On the other hand there could be significant differences
because of treatment carry-over effects on a particular parameter. Or for
that matter, training effects not related to treatment could also be present.

By means of t tests, the differences between treatment groups could be
evaluated at any time-point. It is possible to determine if significant
differences between treatments appear as a reflection of the initial impact
of altitude, staying at altitude, or returning from altitude to sea level.
complex. It would appear appropriate to evaluate differences at each time
period  separately.  Anova  is  a  possibility.  This  could  be  followed  by  some 
of  the  tests  of  all  differences  between  means,  to  find  which  treatment 
differences  contribute  the  most  to  overall  variability. 

Under  either  approach,  when  treatment  groups  differ  significantly  at 
control,  no  simple  answer  follows  from  comparisons  at  later  times.  If 
significance  disappears  at  subsequent  dates,  it  may  be  that  the  passage 
of  time,  training  or  adaptation  to  environment,  tend  to  make  the  response 
in  the  treatment  groups  the  same.  If  the  groups  are  significantly  different 
on  all  measurement  days,  the  interpretation  is  at  best  ambiguous.  It  could 
be  that  all  of  the  test  subjects  happen  to  be  responsive  to  altitude  in 
the  same  way  regardless  of  treatment.  Quantitatively  the  values  may  be 
at  different  levels  for  the  various  treatments.  Again  this  could  be  due 
to  chance,  or  poor  judgment  (or  lack  of  randomness)  in  assigning  the  test 
subjects  to  treatment  groups. 

The problem we have been considering is not unique to the experiments
I have been using as examples. There are parallels in other areas. A
common experimental procedure in nutrition research is to feed test and
normal diets to groups of rats during their most active growth period.
This may cover a period of 8 to 12 weeks immediately after weaning. Comparison
of the growth curves during this period is one way of evaluating the
response to the test diet. Usually initial group-average weights are very
close together. This is partly by intent, and is accomplished in any of
several ways. The experimental animals may be purchased under specifications
limiting the weights to a fairly narrow range. The animals may be assigned
to treatment groups entirely at random, or arrayed by weight and weight
pairs distributed to treatments randomly. Either method usually results in
treatment averages that agree closely.

The experimenter may be interested in either the final weights or in
the route by which they got there. With initial weights not significantly
different, the final weights for the two treatment groups can be examined.
A t test of the difference between the group means seems appropriate. If
the difference is significant, there may be interest in when this became
apparent. In a 12-week experiment, differences might be tested at 6, 8 and
10 weeks to locate when the divergence became significant. Actual times
could be selected from examination of the raw data in chart form. In most
of these experiments the precise shape of the growth curves is of less interest
than evidence of significant divergence. In any event, it would be possible
to establish whether differences became significant early or late in the
experiment. Also in the case of non-significance of final differences,
it might be useful to know if significant differences appeared at midpoint
and then disappeared as the laggards "caught up." In general, however, the
primary emphasis has been on differences and not on levels of weight
achieved at any particular time. In some experiments, such as those involving
mature animals, there could be interest in level changes within treatments
as well as in differences between treatments.

To get back to the altitude problem. Are there alternatives?

One that occurs to me is to express a parameter in some other form to
facilitate comparison. Perhaps values for each subject expressed relative
to his control as 100 percent might permit meaningful evaluation. Or
would it suffice in a given case simply to note that under one dietary
regimen, values at altitude are not significantly different from control
while under another regimen, they are?
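
The percent-of-control idea can be illustrated with a short sketch. The
readings and the function name percent_of_control are hypothetical; the
only assumption is that the first value in each subject's series is his
control reading.

```python
def percent_of_control(series):
    """Express each reading as a percentage of the subject's own control (first) value."""
    control = series[0]
    return [100.0 * value / control for value in series]


# Hypothetical subject: control, two readings at altitude, return to sea level.
subject = [50.0, 60.0, 65.0, 52.0]
relative = percent_of_control(subject)
print(relative)
```

Putting every subject on this common scale removes differences in starting
level, which is the comparison difficulty described above, though it does
not by itself settle which significance test is appropriate.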

I have not found a satisfactory and definitive answer of universal
application in experiments of the kinds I have used as illustrations.
It would seem that a large dose of judgment is essential to guide a
statistical evaluation of this kind. The reaction of the panel to the
various possibilities is solicited. Suggestions for entirely different
approaches also are in order.


ANALYSIS OF DATA FROM THE WOUND DATA AND
MUNITIONS EFFECTIVENESS TEAM IN VIETNAM


W. Bruchey, L. Sturdivan, and R. Whitmire
Terminal  Ballistics  Laboratory 
U.  S.  Army  Aberdeen  Research  and  Development  Center 
Aberdeen  Proving  Ground,  Maryland 


I. INTRODUCTION. Efforts of the U. S. Army to gather data on wound
ballistics date back into the 19th century. In "modern" times, the
laboratory experiments have been supplemented by data gathered in the
battlefield. We refer, of course, to the U. S. Army Surgeon General's
report on a number of engagements in the second World War and Korea,
entitled, "Wound Ballistics." Until the Vietnam conflict, however,
efforts to collect field data were rather limited in both breadth and
depth. In August, 1966, the Vice Chief of Staff, U. S. Army, made
known the requirement that a study be undertaken to gather data
pertinent to evaluating the effectiveness of antipersonnel munitions
deployed in Vietnam, including a comprehensive study of wounds and
post-wounding behavior of resultant casualties. The Wound Data and
Munitions Effectiveness Team, called WDMET, was organized to fulfill
this mission. A data collection format with eleven (11) sections
dealing with specific areas of interest was compiled from the
requirements of relevant government agencies. A team of forty-three (43) men
with various military specialties was given training in ballistics,
wound ballistics and collection procedures in late April and early May,
1967 at the Army Chemical Center, Edgewood Arsenal, Maryland. By late
July, 1967, the Team was in operation in Vietnam. Another group of
about ten (10) men was assigned to Edgewood Arsenal as a center for
receiving, processing and analyzing the data from the Vietnam Teams.

In conjunction with the Wound Ballistics Group of the Ballistic Research
Laboratories (BRL), Aberdeen Proving Ground, a complete system for
storage, retrieval, and analysis of the WDMET data was designed for
the BRL electronic computer, BRLESC.

The WDMET Team in Vietnam was organized into four (4) sections:
(1) Headquarters and Support Section in Saigon; (2) Section I in An Khe,
following units of the 1st Air Cavalry; (3) Section II at Cu Chi, covering
elements of the 25th Infantry Division; and, (4) the Pathology Section at
the Saigon Mortuary. Each field section studied American casualties from a
battalion-sized unit. Section I reported on 100% of the casualties in
its selected units, while Section II covered all casualties in selected
engagements. The Pathology Section autopsied the "killed-in-action" and
"died-of-wounds" casualties which had been covered by the field teams.
In addition, they performed autopsies on selected cases not covered by
the field sections, but which could contribute to fulfilling the WDMET
mission.


Due to the irregular character of the conflict in Vietnam, most
cases contribute useful information to only a small part of the WDMET
area of interest. The cases are not randomly selected, and they are
"typical" only insofar as the war in the two areas covered is typical
of Vietnam as a whole. The enemy's weapons are often improvised or not
seen, making identification or characterization of the weapon difficult
or impossible in those cases. This type of data was collected under
most unfavorable conditions, to say the least. The WDMET personnel, of
course, were never allowed to interfere with the mission of the units
being covered or with the proper medical treatment of the wounded. Data
could not be obtained until the engagement was terminated.

In  addition  to  the  problems  inherent  in  the  method  of  data  acquisition, 
biases  are  present  in  the  data  selection  procedures.  As  stated  previously, 
Field Section I attempted to get information on all casualties from a
selected  unit.  This  was  done  whenever  feasible.  However,  due  to  the 
limited  number  of  personnel  available  for  data  acquisition  and  the  nature 
of  the  Vietnam  conflict,  there  were  periods  of  intense  activity  during 
which  it  was  not  possible  to  cover  every  casualty.  There  was  necessarily 
some  case  selection  on  an  individual  basis  with  the  team  member  forced  to 
select  the  "most  valuable"  cases.  Field  Section  II,  on  the  other  hand, 
selected  incidents  from  which  all  casualties  were  covered.  Incidents 
were  generally  selected  on  the  basis  of  weapon  or  weapons  involved  and 
the  availability  of  information. 

As the completed casualty reports are received at WDMET(C), Edgewood
Arsenal, they are coded onto punch cards and are submitted to the Wound
Ballistics Group at BRL for processing on the BRLESC computer. The
actual processing of the coded information is handled by three separate
computer programs: (1) an error checking program; (2) a print-out
program; and, (3) an analysis program. The error checking program checks
for selected punching errors in the input data; the print-out program
reads the input data and produces a narrative print-out for each
casualty report. It is the third program which is of interest here.

At  present,  the  analysis  program  produces  a  simple  enumeration  of 
the  frequency  of  occurrence  of  the  various  factors  contained  in  the 
study  and  correlations  among  the  factors  (frequency  of  occurrence  of 
two  or  more  factors  in  the  same  case) .  Presented  here  is  a  selection 
of  correlations  generated  by  this  computer  program.  These  specific 
correlations  were  chosen  because  of  their  interesting  content  and  the 
likelihood  that  they  would  contain  the  largest  number  of  data  points 
for  our  limited  size  sample. 
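
As used here, a "correlation" is a count of how often two factors occur
together in the same case. A minimal sketch of such a two-factor tally
follows. It is not the BRLESC analysis program; the case records, factor
names, and the function cross_tab are hypothetical.

```python
from collections import Counter


def cross_tab(cases, factor_1, factor_2):
    """Count joint occurrences of two factors, skipping cases that lack either one."""
    return Counter(
        (case[factor_1], case[factor_2])
        for case in cases
        if factor_1 in case and factor_2 in case
    )


# Hypothetical coded cases; each case records only the factors that were reported.
cases = [
    {"weapon": "rifle", "area": "thorax"},
    {"weapon": "fragment", "area": "lower extremity"},
    {"weapon": "fragment", "area": "lower extremity"},
    {"weapon": "rifle", "area": "head"},
    {"weapon": "fragment"},  # body area not reported, so not counted
]

table = cross_tab(cases, "weapon", "area")
for (weapon, area), count in sorted(table.items()):
    print(f"{weapon:10s} {area:16s} {count}")
```

Cases with missing factors drop out of the tally, which is one reason the
sample sizes behind individual correlations are smaller than the total
number of cases.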

Our problem centers about the interpretation of selected correlations
and methods of determining if significant differences exist between the
distributions of groups of data as they are received from Vietnam.


II. FREQUENCY OF OCCURRENCE OF FACTORS. The eleven (11) sections
of the data collection format are composed primarily of coded information.
For example, the type of weapon carried by the casualty was coded as follows:


Code No.     Weapon

   0         Unknown
   1         M16
   2         M14
   3         M79
   4         M60
   5         Mortar
   6         Rocket
   7         Other


In  this  manner,  it  was  possible  to  determine  if  a  factor  occurred  and 
how  often.  The  following  three  tables  represent  this  type  of  enumeration 
of  factors. 
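
The enumeration step can be sketched in a few lines. The code-to-weapon
mapping is the one given above; the list of coded cases and the variable
names are hypothetical.

```python
from collections import Counter

# Code numbers for the weapon carried by the casualty, as listed above.
WEAPON_CODES = {
    0: "Unknown",
    1: "M16",
    2: "M14",
    3: "M79",
    4: "M60",
    5: "Mortar",
    6: "Rocket",
    7: "Other",
}

# Hypothetical punch-card values, one per casualty report.
coded_cases = [1, 1, 3, 0, 4, 1, 7, 2]

# Decode each value and count how often each weapon occurs.
frequency = Counter(WEAPON_CODES[code] for code in coded_cases)
for weapon, count in frequency.most_common():
    print(f"{weapon:8s} {count}")
```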


The data presented in Table 1 represent the number of casualties
associated with each injury type for the first 930 cases received and
coded onto punch cards by WDMET(C). The totals on injury type come to
904 cases; adding to this the 26 cases which had no information on
injury type brings the total to 930. It should be understood that this
distribution  is  not  truly  representative  of  the  Vietnam  conflict  as  a 
whole  in  that  there  is  a  higher  percentage  of  fatalities  than  is  found 
in  the  casualty  distributions  as  compiled  by  the  Office  of  the  Surgeon 
General.  This  is  due  primarily  to  the  fact  that  the  personnel  located 
at  Saigon  Mortuary  performed  a  number  of  autopsies  in  wound  pathology 
studies  apart  from  cases  studied  in  the  field.  When  this  is  taken  into 
consideration,  much  of  the  difference  between  the  true  distribution  and 
the  WDMET  distribution  is  removed. 

The  upper  half  of  Table  2  lists  the  types  and  frequency  of  occurrence 
of  body  armor  encountered  in  the  study  thus  far.  The  total  number  of 
casualties  who  were  wearing  body  armor  was  139;  those  known  not  to  be 
wearing  armor,  365.  This  gives  a  ratio  of  4  to  1  of  armor  not  worn  to 
armor  worn. 


The  number  of  hits  on  the  body  armor  and  helmet  is  shown  in  the 
upper  half  of  Table  3.  In  general,  one  would  expect  the  quantity  of 
hits  on  the  helmet  and  body  armor  to  be  large.  Out  of  139  casualties 
wearing  body  armor,  more  hits  should  have  been  on  the  armor.  Likewise, 
there were 585 casualties known to be wearing a helmet; this is 80% of
the 721 cases which contained the body armor set. Using information
compiled  from  the  analysis  program,  it  was  found  that  the  average  number 
of  hits  per  casualty  was  3.4.  Using  this  information,  the  following 
table may be constructed:

                                           Body Armor    Helmet

No. Known to be Wearing Equipment              139         585
Avg. No. Hits Per Casualty                     3.4         3.4
Expected No. of Hits                           473        1989
% Body Area Covered by Protective Gear         23%          7%
Expected No. of Hits on Equipment              109         139
Actual No. of Hits on Equipment                 49          66
Actual No. as Percent of Expected No.          45%         47%


From  these  calculations,  it  is  concluded  that  more  than  half  the 
armor  and  helmets  are  not  available  for  examination  or  that  hits  are 
not  noticed  on  the  equipment  examined  (e.g.,  in  a  helmet  which  is  badly 
battered  from  driving  tent  stakes  or  armor  with  worn  or  frayed  spots 
concealing  small  hits) . 
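
The arithmetic behind the table can be retraced with a short sketch. The
input figures are those given in the text; the function name and the
choice to round at each step (as the table appears to do) are my own
assumptions.

```python
def expected_vs_actual(n_wearing, hits_per_casualty, area_fraction, actual_hits):
    """Retrace one column of the table: expected hits, expected hits on
    the equipment, and actual hits as a percent of expected."""
    expected_hits = round(n_wearing * hits_per_casualty)
    expected_on_equipment = round(expected_hits * area_fraction)
    percent_of_expected = round(100 * actual_hits / expected_on_equipment)
    return expected_hits, expected_on_equipment, percent_of_expected


armor = expected_vs_actual(139, 3.4, 0.23, 49)
helmet = expected_vs_actual(585, 3.4, 0.07, 66)
print("body armor:", armor)   # (473, 109, 45)
print("helmet:    ", helmet)  # (1989, 139, 47)
```

The calculation assumes hits fall on the body in proportion to area
covered, so a shortfall of actual against expected hits can reflect either
unexamined equipment or unnoticed hits, as stated above.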

The lower half of the table shows the quantity and boot type worn.
The total for the "no boot" category suggests that during the shelling
of base camps many troops are in bed or relaxing with their boots off.


III. TWO-WAY CORRELATIONS OF FACTORS. In addition to simple
enumeration of the various factors studied, correlations between pairs
of factors were also found, an example of which is presented in Table 4.
The correlation is between wound location by six body areas and activities
accomplished or not accomplished. The intent in gathering this information
was to explore the relationship between wound location and incapacitation.
The speed with which the casualties are evacuated seldom leaves
the soldier time to attempt any task. The data seem to show that the
soldier seldom tries something he cannot do, as activities accomplished
outnumber those not accomplished by over nine to one.

Another two-factor correlation, weapon versus location of hit, is
contained in Table 5. The right hand column is the total of the row
for each weapon. To circumvent the overwhelming quantity of numbers,
two major groups of weapons were extracted to make the last two rows.
These two groups will be referred to as rifles and fragments hereafter.
To further simplify matters, the wound distribution for rifles and
fragments is transformed into percentages in Table 6.

Table 6 also shows the distribution of wounds by body area correlated
with a number of other factors. The first column shows the wound
distribution compiled by accumulating the total number of hits in a body area
then transforming those totals into percentages. The second column was
compiled by accumulating presence of a hit in each body area over all
casualties then transforming these totals into percentages. For instance,
if a casualty received twelve (12) hits distributed between the head and
thorax, the sample number under "total number of hits" would be increased by 12;
however, the sample number under "presence of a hit" would be increased
by 2, one for presence of a hit in the head and one for presence of a
hit in the thorax. As is evident from a comparison of the percentages
in these two columns, there is no great difference between the two
methods of accumulating the wound distribution. Evidently, the body
area which receives the most hits is also the area most likely to be hit
(one or more times). The one possible exception is the lower extremity.
When it is hit, the lower extremity seems to get more hits than other
areas. This could be because the lower extremity tends to have more
shielding from fragmenting munitions than the upper parts of the body,
so when a fragmenting munition does detonate near enough to the man
that the legs are exposed, the probability of multiple hits, especially
to the lower extremity, is quite high.
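
The two ways of accumulating the wound distribution can be made concrete
with a small sketch. The casualty records and function names are
hypothetical; the counting rules follow the description above (each hit
counts under "total number of hits", each body area hit at least once
counts once under "presence of a hit").

```python
from collections import Counter


def wound_distributions(casualties):
    """Accumulate hits by body area two ways: total hits and presence of a hit."""
    total_hits = Counter()
    presence = Counter()
    for hits_by_area in casualties:
        for area, n_hits in hits_by_area.items():
            if n_hits > 0:
                total_hits[area] += n_hits   # every hit counts
                presence[area] += 1          # an area counts once per casualty
    return total_hits, presence


def as_percent(counts):
    """Convert accumulated counts into percentages of their grand total."""
    grand_total = sum(counts.values())
    return {area: 100.0 * n / grand_total for area, n in counts.items()}


# Hypothetical casualty with twelve hits spread over the head and thorax.
casualties = [{"head": 5, "thorax": 7}]
total_hits, presence = wound_distributions(casualties)
print(sum(total_hits.values()), sum(presence.values()))
print(as_percent(presence))
```

For this single casualty the first tally rises by 12 and the second by 2,
matching the example in the text.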

The next pair of columns was derived from Table 5 where the numbers
of hits by rifle bullets and fragments have been converted into
percentages. The percentages for fragmenting munitions are almost identical to
those of presence of a hit (by any weapon), but the increased percentage
of hits in the combined head, neck, and thorax areas for rifles might well
be an indication of aimed fire.

The next group of correlations in Table 6 shows wound location versus
three categories of body position: upright, "doubled-up," and lying
(which is 90% prone). Percentages do not differ enough, column to column,
to be highly significant. However, in each case the small difference is
in the direction which one would expect. For instance, in moving from
the upright into a doubled-up position the head, neck, thorax, and upper
extremities do not change in presented area; however, the lower abdomen,
pelvis, and lower extremity are those parts which are doubled-up, providing
shielding to each other, and thus losing presented area. The relative
percentages of hits under these two positions reflect these observations.

The mean presented area of the head and neck to horizontal hits is
about 6.5% of the total presented area of the upright man. Why, then,
is there such a large percentage of hits on the head and neck of the man
in an upright position? Part of the reason has already been mentioned:
in a ground burst the fragment sprays will be limited by earth,
irregularities in the surface, flora, stones, or other low-lying cover, so that there
will be some angle with the horizontal below which few or no fragments will
be found. Thus, the upper parts will receive more hits, on the average,
than their mean presented area warrants. If, on the other hand, the
fragments are from direct fire artillery or proximity fuse munitions
the burst is considerably above the ground. In this case, the presented
area of a man is like his appearance if one is standing on a building
looking down at him. The majority of his presented area in this case
is head, thorax and shoulders. For the soldier in the lying position,
cover is a major factor in wound distribution. More cover is offered
near the ground, and that is usually the reason the man is lying there
in the first place, to take advantage of whatever cover is available.
Therefore, we would expect that with this highly variable factor, the
wound distribution would be very erratic, which it is. We also would
expect that the tendency toward greater numbers of hits in the higher
parts of the body would disappear, which it does.

The last three columns of Table 6 correlate injury type, fatal
and non-fatal, and cause of death to body area wounded. As expected,
for the WIA's the larger proportion of hits occur on the least
vulnerable parts, the extremities; while for the KIA and DOW's the larger
percentages occur in the head, neck, and thorax. The last line in
the table shows that the average casualty received 3.50 hits in 1.82
body areas. Since each casualty received (on the average) several
wounds in two body parts, the distribution of wounds in the KIA and
DOW's does not show the true distribution of cause of death. For
instance, a fatality may have a bullet wound of the leg, but it was
the bullet through the heart that killed him. When only the wound
causing death is considered, the last column of Table 6 results.

In Table 7, the rifle bullet and fragment wound distributions of
Table 6 are further broken down into hits on the front or back of the
body. The purpose therein was to determine if hits about the body are
truly random for bullets or fragments. The differences which stand out
in this comparison are the front and back of the head and neck for
fragments and the front and back of the upper extremity for bullets.
Considering the latter first, we note that the body diagrams used in
this study consider the man to be standing in the standard anatomical
position; i.e., with the palms of the hands facing forward. In the
battlefield, the soldier can be envisioned to be holding his arms in
almost any other position rather than the standard anatomical position,
resulting in considerable ambiguity in what is the front and what is
the back of the arm. As a matter of fact, the palms are usually turned
toward the body with the result that a large part of what is called the
front of the lower arm is usually shielded from being hit by the trunk.

The much lower incidence of fragment hits on the back of the head
and neck, mentioned earlier, is due to the ability of the helmet to
defeat most incoming fragments and the much greater shielding that the
helmet provides the back of the head and neck. When the effect of
helmet protection from fragments and the ambiguity of which is the
front or the back of the arm are removed, the lower part of Table 7
follows. Here it is seen that there is an insignificantly small
difference between the percentage of fragments striking the front or back
of the soldier. For bullets, on the other hand, a difference was
expected because of the highly directional nature of rifle fire; viz.,
in a fire fight the troops are usually facing each other across some
kind of battle front.

Table 8 shows the number of injuries and fatalities due to enemy
fire as a function of weapon. The first two columns contain the numbers
of fatal and non-fatal cases while the third column lists the ratio of
the two. The quantity NF/F will be referred to as the "survival
index." Thus, weapons such as rifles and the Claymore mine have a
low survival index (or, conversely, a high fatality rate). The
recoilless rifle and large land mine display a moderate survival
index while most other fragmenting munitions show a moderate to high
survival index, with hand grenade and artillery showing a very high index.
In the lower part of the table, the rifle and fragment combination
groups are listed.
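
A short sketch of the survival index follows. The counts are hypothetical
and are not taken from Table 8; the function names are my own. The
fatality fraction shown alongside is the algebraically equivalent
F/(NF + F) = 1/(1 + NF/F).

```python
def survival_index(non_fatal, fatal):
    """Survival index NF/F; undefined (None) when there are no fatal cases."""
    if fatal == 0:
        return None
    return non_fatal / fatal


def fatality_fraction(non_fatal, fatal):
    """Fraction of cases that were fatal, F / (NF + F)."""
    return fatal / (non_fatal + fatal)


# Hypothetical (non-fatal, fatal) counts for two illustrative weapon groups.
counts = {"weapon A": (30, 10), "weapon B": (45, 5)}

for weapon, (nf, f) in counts.items():
    print(weapon, survival_index(nf, f), fatality_fraction(nf, f))
```

A low index corresponds to a high fatality fraction, which is the converse
relationship noted in the text.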

IV. THREE-WAY CORRELATION OF FACTORS. Figures 1 through 6 all
show the results of one correlation of three factors: injury type
(KIA, DOW, WIA, etc.), weapon, and distance between the casualty and
weapon (or detonation). Figure 1 displays graphically the cumulative
distribution of fragment and rifle bullet injuries regardless of injury
type. The curves show that fragmenting munitions are much shorter range
weapons than rifles, as would be expected. To quantify this, note
that 90% of the fragment wounds occur at 40 meters or less whereas 90%
of the bullet wounds are accumulated only at 160 meters. Figures 2 and
3 split the data in Figure 1 into fatal and non-fatal wounds. Ninety
percent (90%) of the fatal fragment wounds occur at ranges less than
30 meters. Ninety percent (90%) of fatal bullet wounds occur at ranges
of 125 meters or less. Figures 4 and 5 show the same data presented in
a slightly different manner.
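
Reading a "90% point" off a cumulative distribution can be sketched as
follows. The ranges are hypothetical and the function name is my own; the
rule used (the smallest observed range at or below which the stated
percentage of cases fall) is one reasonable convention, not necessarily
the one used to draw the figures.

```python
def range_at_percent(ranges_m, percent=90):
    """Smallest observed range such that at least `percent` of cases lie at or below it."""
    ordered = sorted(ranges_m)
    # Integer ceiling of percent * n / 100 avoids floating-point edge cases.
    k = (percent * len(ordered) + 99) // 100
    return ordered[k - 1]


# Hypothetical casualty-to-detonation distances in meters.
fragment_ranges = [5, 10, 10, 15, 20, 25, 30, 35, 40, 150]
print(range_at_percent(fragment_ranges, 90))
```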

Figure 6 shows a slightly different method of presenting a three-way
correlation. The factors correlated are weapon, range, and number
of hits a casualty received. Specifically, the curve in Figure 6 is
the cumulative distribution of number of hits for fragments. The
family of curves represents categories of range. Similar data were
obtained for bullets, but the sample sizes were too small to give a
coherent form to the curves. These last five curves indicate the
difficulty one has in displaying multiple correlations even when the
weapons were combined into only two categories. Presenting a full
four-way correlation in a reasonable space borders on the impossible.

In  the  future,  as  more  cases  are  entered  on  punch  cards,  a  similar 
"analysis"  will  be  conducted  using  more  factors  and  more  extensive 
correlations.  It  is  hoped  that  sample  sizes  will  be  large  enough,  and 
definite  enough,  to  clear  up  the  inconclusiveness  in  some  of  the  data 
presented here.


[Table: Wound Distribution: Percentage of Hits in the Various Body Areas. Footnote 1: Correlations with Presence of Hit.]

[Table: Wound Distribution: Presence in Various Body Areas Versus Weapon]

[Table: No. Wounded as a Result of Hostile Action]


EFFECT OF NUMBER OF OBSERVING STATIONS
ON FLIGHT MEASUREMENT PRECISION


Fred  S.  Henson 
National  Range  Operations 
White Sands Missile Range, New Mexico

ABSTRACT. Estimation of the standard error of a measured space-
position is reviewed. Pooling such standard deviations for the portion of a
trajectory covered by a given measuring system - and for a series of tests on
the same missile - is discussed. Results are presented showing the dependence
of average position-precision on number of stations used in the solution. The
correlation of these variables in operating data is dominant and the
magnitude of the effect is profound. The exponential improvement of
position-precision by increasing stations can be as much as four times the
effect of increased sample-size on the standard-error-of-the-mean of a normal
distribution. Mechanisms considered embrace: geometric convergence,
observational constraints, methodological deficiencies, and statistical
considerations. The exponential dependence of position-precision on number of
cinetheodolites may be an index of the measurability of the object
('readability' of its point-of-reference). Statistical measures-of-goodness of
geometric convergence are derived. A procedure is suggested for rating
test-configurations. It is shown that calculating observationally-redundant
precision of nonredundant solutions is a generalization of the classical
calculation of the precision of single observations from the precision-of-the-
mean of a sample of a given size. A need is suggested for a statistics of
observations which define geometric surfaces in space. (This may be a
generalization of numerical statistics.) Results are also presented showing the
dependence of precisions of derived velocity and acceleration on number of
stations. A probabilistic improvement of physical accuracy (bias) by
increasing stations in flight-measurement is hypothesized. A summary is
appended.


This paper has been reproduced photographically from the
manuscript submitted by the author.


INTRODUCTION. Our reviews of over- and under-meeting of quality requirements made it necessary
to investigate the relationship between quality and resources in our data-support operation.

We  had  been  aware  of  the  statistical  improvement  of  precision  in  which  ordinary  averages  bunch  closer 
together  in  proportion  to  the  square  root  of  the  number  averaged. 
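
In symbols, the familiar result referred to here is the standard error of
the mean. The notation is my own: sigma is the standard deviation of a
single observation and n the number of independent observations averaged.

```latex
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}
```

Averaging four observations thus halves the spread of the mean, and
averaging sixteen quarters it, provided the observations are independent
and share the same standard deviation.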

Much  of  the  following  work  was  published  in  internal  memoranda  during  the  fall  of  1967. 

BACKGROUND.  This  paper  is  clinical  in  the  sense  that  it  is  exploratory. 

Since January 1963, White Sands Missile Range has built a sufficient, standard basis for a
data-precision spec into its user-document format - in the interest of comparability. (We say what we mean
by our numbers, and that we assume the user's numbers mean the same thing - unless he makes it very clear
otherwise.) WSMR-standard precision is the average standard-error of component values of data, obtained
by propagation from the previous stage of the collection-reduction process. This precision-index can be tied
back directly to station quality, and film-reader quality. It applies to the data in the form in which it is
reported.

Our data-precision is many things. It's the radius of confusion of a data value, due to the disagreement
among the stations. It's how well we can know from the observations what the value is. When related to a
valid requirement, it's a measure of Range effectiveness. Precision is available in current operation, it
affords a means for operational control, and suffices for some of the user's needs. Apparently, in
flight-measurement, optimizing system precision tends to optimize system accuracy. We furnish our users
the precision of each data value. And, we use root-mean-square average precisions - by segment, by test, by
month, and by program - for operating and management control.

I gave a historical and exploratory paper on data-support quality control three years ago at the Design
of Experiments Conference (Ref. 1). Our monthly Data Quality reports (Ref. 2) give actual average
precisions - by station, by measuring system, and by missile - along with the requirements and
commitments. Averages are monthly, and cumulative for the fiscal year. Being definite and quantitative,
keeping usable scores on data quality, and controlling closely on the basis of results are a bit of a departure
from missile-range tradition. Personally, I feel it is to the advantage of mission personnel to provide
management quantitative bases for decisions.

ESTIMATION OF PRECISION.  Figure 1 is a summary of the math we use to calculate precision of
observed position for cinetheodolites.

This is from our Data Reduction Handbook - 'R. C. Davis' method (Ref. 3). We solve for position by
minimizing the sum-of-squares of the deviations of the stations in azimuth and elevation, from their
least-squares point. In the first equation, cos allows for the fact azimuth circles get smaller as one goes up
- until 'the universe comes to a point directly over each of our stations'. As the azimuth circles get small we
lose resolution; the azimuth error becomes something between ungodly and unknown. So, we temper it by
the cosine of the elevation angle. (Reference 14 is a formal explanation of this correction.) We average the
angular deviations as their squares. The square root of that (σ_A) is our measure of the average
disagreement among the angular observations. Use of the 3-dimensional degrees-of-freedom (2N-3) yields an
estimate of the population standard deviation of angular observations (of what it would be if we could
repeat them indefinitely).
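
The estimate of the angular standard deviation described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the Handbook's procedure: the function name and argument layout are hypothetical, and the exact placement of the cosine factor is assumed from the prose (Figure 1 holds the actual equations).

```python
import math

def angular_sigma(az_resid, el_resid, elevations):
    """Pooled angular standard deviation for N stations (hypothetical helper).

    az_resid, el_resid: residuals in radians of each station about the
    least-squares point; elevations: elevation angles in radians.
    Assumption: each azimuth residual is tempered by cos(elevation).
    """
    n = len(az_resid)
    if not (n == len(el_resid) == len(elevations)):
        raise ValueError("need one residual pair and one elevation per station")
    dof = 2 * n - 3  # two angles per station, three position unknowns
    if dof <= 0:
        raise ValueError("need at least 2 stations")
    sum_sq = sum((da * math.cos(e)) ** 2 + de ** 2
                 for da, de, e in zip(az_resid, el_resid, elevations))
    return math.sqrt(sum_sq / dof)
```

With 3 stations there are 6 angular observations and 3 degrees of freedom, so the divisor is 3 rather than 6.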

Now, we least-squares in 3 dimensions. But, forever after we treat the components separately, as
though they had been independently determined (we'll come back to that). Multiplying the variance of the
angular deviations by the cofactor of the proper element of the (principal) diagonal of the determinant of
the least-squares matrix, then dividing by the (value of the) determinant, is said (by a liberal interpretation)
to transform the angular variance to a linear-component variance of the least-squares position mean. It is
planned to verify this last point by: converting angular residuals to their linear-component equivalents;
calculating standard-deviation and standard-error-of-mean of each set of these (about the least-squares
mean); noting which is closer to the matrix result.
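
The cofactor-over-determinant step can be illustrated as follows. The sketch assumes a 3x3 least-squares (normal-equation) matrix supplied as nested lists; the function names are hypothetical. The cofactor of a diagonal element divided by the determinant is the corresponding diagonal element of the inverse matrix, so this is the usual variance propagation written out by hand.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def diagonal_cofactor(m, k):
    """Cofactor of the k-th principal-diagonal element (its sign is always +)."""
    p, q = [j for j in range(3) if j != k]
    return m[p][p] * m[q][q] - m[p][q] * m[q][p]

def component_variances(ls_matrix, angular_variance):
    """Angular variance times cofactor / determinant, for x, y and z."""
    d = det3(ls_matrix)
    if d == 0:
        raise ValueError("singular least-squares matrix")
    return [angular_variance * diagonal_cofactor(ls_matrix, k) / d
            for k in range(3)]
```

For a diagonal matrix the result reduces to the angular variance divided by each diagonal element, which is a convenient hand check.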

We finally multiply each standard deviation by the proper value of the ‘t’ statistic, to correct for the
relatively small departure from normality at the 68.3% probability level due to the small sample-size alone
(to the small degrees-of-freedom). Additionally, we usually obtain a precision of smoothed position by
dividing the standard-deviation of a series of points, about its 2nd-degree fit, by the proper reduction-factor
in terms of the number-of-points. (We are not yet incorporating lack-of-fit into smoothed-position
precision.) Our precisions of velocity and acceleration are obtained by propagating smoothed-position
precision through the 1st and 2nd derivatives of smoothed position.

AVERAGING PRECISION.  It is physically necessary to describe measurement quality in terms of
frequency distributions. It is operationally and managerially necessary to describe data quality in job lots
(segments or tests) - and wholesale (series-of-tests). Since January 1963, WSMR has officially defined data
quality as the average precision for the firings covered by the documentation.

When a requirement or commitment is met as an rms average, approximately 68.3% of the data values fall
within the stated tolerance of their statistically-true values. (I.e., when compliance of the individual
standard deviations is 50%, average compliance of the data values which they characterize approximates
68.3%.)


In root-mean-squaring a component precision for a segment or a test, our denominator is the number
of component values. Then, our test-average quality is the root-mean-square of the 3 test-average
component precisions. (Yielding the radius of that sphere which conventionally approximates the average
error-ellipsoid.) In our monthly and cumulative project averages, tests are given equal weight.

What constitutes a statistical population is an operational decision. Our average precisions are
calculated by the same procedures each time, so they have an operational validity (we're not in the rigor
business). We are interested in knowing the magnitude and direction of the errors resulting from our
non-ivory-tower applications of ivory-tower methods.

One  purpose  of  statistics  is  to  numerically  characterize  errors.  This  paper  suggests  that  includes 
numerically  characterizing  the  errors  incurred  in  applying  statistics  to  the  real  world. 

Let's  look  at  some  operational  findings. 
EFFECT OF NUMBER OF STATIONS.  In this investigation, test-average precisions were sorted by
average-number-of-stations (to the nearest integer) in the solution. Each group of precisions was then
root-mean-squared, the denominator being the number of rounds.

Figure 2 shows average precision of position of Navy bombs vs average number of Askania
(cinetheodolite) stations computed (Jan-Jul 1967). The horizontal ticks are the plotted points. Numbers on
the graph are the number of rounds (tests) represented by each plotted point. The number of
position-points in one of our tests varies widely. Typically, it is a few hundred, times 3 components. The
Navy bomb-drop is a highly diversified program. In a manner of speaking, the range and user 'did their
worst'; but the average quality depended on only one variable (except for the limited 5-station data). That
data is shown both as is, and after deleting the worst round. It may indicate they were running out of
reasonably well-located stations for covering the bomb, which impacts the ground.

GENERALIZATION OF MODEL.  Please turn to Figure 3. This merely looks hard. It's only one
equation (equation (5)) - transformed (equation (1)) - and generalized (in both forms). If our only effect
were numerical redundancy, position quality would improve in proportion to the square-root of the
number-of-observations (Ref. 4). Or, to the square-root of the number-of-stations, since the
number-of-observations per station is constant for a system. Equation (5) is the basic form, for optics, of
this classical theorem (for x normally distributed). For radar the 2 becomes 3; for DOVAP it becomes 1.

When the only effect is number of observations, the population standard deviation of individual
observations (σ_x) is of course unchanged. So, if equation (5) is used twice in constructing a curve, it reduces
to its working form, equation (1). You might guess that (6), (7), (8) and (2), (3), (4) were empirically
derived by generalizing the exponent in (5) and in (1). Equations (5), (6), (7), and (8) are hyperbolas of the
respective types:

$$x^{1/2}\,y = k, \qquad x\,y = k, \qquad x^{3/2}\,y = k, \qquad x^{2}\,y = k.$$

In Figure 4, the operational curve is the solid one. The working form of the classical-statistical
equation was used both ways from a midpoint to construct the curve shown as a string-of-beads. This didn't
do the job so we had to look further. The upper half of the Navy-bomb curve is closest to improvement of
precision as N^(3/2). The lower half is closest to N^2. Overall, it's a tossup between those two. Physical
interpretations of these 3 hyperbolas are possible at each 1/2-station, for optics. At 1-1/2 stations, the curves
represent standard deviations of single observations of a point-in-3-dimensional-space. The interpretations
of infinitely-poor precision for zero stations and of infinite stations for perfect precision are obvious.
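
A comparison curve of this kind can be generated from a midpoint as sketched below. Figure 3 is not reproduced here, so the working form is an assumption: precision at N stations equals the reference precision times (N_ref / N) raised to a power, with power 1/2 for the classical case and 1, 3/2 or 2 for the generalized hyperbolas. The function name is hypothetical.

```python
def scaled_precision(sigma_ref, n_ref, n, power=0.5):
    """Precision expected at n stations, given sigma_ref at n_ref stations.

    Assumed working form: sigma = sigma_ref * (n_ref / n) ** power.
    """
    if n <= 0 or n_ref <= 0:
        raise ValueError("station counts must be positive")
    return sigma_ref * (n_ref / n) ** power

def comparison_curve(sigma_mid, n_mid, station_counts, power):
    """Work both ways from a midpoint, as for the string-of-beads curve."""
    return [scaled_precision(sigma_mid, n_mid, n, power) for n in station_counts]
```

Starting from 10 ft at 4 stations, the classical power 1/2 predicts about 14.1 ft at 2 stations and 7.1 ft at 8, while power 2 predicts 40 ft and 2.5 ft; plotting the candidates against the operational curve shows which is closest.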

VARIABILITY  AT  A  GIVEN  NUMBER  OF  STATIONS.  Curvilinear  correlation  coefficients  might  be 
calculated for equation (7) applied to the upper half of the Navy-bomb data and for equation (8) applied to
the  lower  half.  Differences  between  these  and  unity  would  estimate  the  relative  influence  of  all  factors 
other  than  number-of-stations.  Further,  the  standard  deviations  of  individual-round  precisions  about  the 
pooled  values,  at  each  number  of  stations,  could  be  calculated  from  the  round-average  data.  Or,  they  could 
be  calculated  about  the  corresponding  points  on  the  fitted  curves.  These  sigmas  could  be  used  to  set  current 
individual-round  tolerances  for  the  models,  at  each  number  of  stations  (it  is  planned  to  use  this  approach). 


The  corresponding  tolerances  of  pooled  (cumulative)  precisions  could  be  estimated  as  the  appropriate 
standard  deviation  of  individual  rounds  divided  by  the  square  root  of  the  number  in  the  average.  In 
calculating  correlation  of  round-average  precision,  or  in  controlling  a  cumulative  average,  the  average 
number of stations should be carried to the first decimal place (it is available to 4 places).

Dr. H. H. Germond suggests plotting average precision vs number of stations on log-log paper. In
this way, the data may be fitted with straight lines whose slopes are negatives of the corresponding powers
of N of the hyperbolas (Ref. 5). Straight lines fit people better. This investigation has emphasized direct
study of the relationships. Linear correlation coefficients would of course apply to log precision vs log
number-of-stations, rather than to precision vs number-of-stations.
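
The log-log approach can be sketched as an ordinary least-squares line through (log N, log precision). The function name and the sample numbers in the usage note are illustrative only, not range data.

```python
import math

def fit_power(n_stations, precisions):
    """Fit precision = k * N**(-power) by a straight line in log-log space.

    Returns (power, k, r), where r is the linear correlation coefficient
    of log precision vs log N. Needs at least 2 distinct N values and
    precisions that are not all equal.
    """
    xs = [math.log(n) for n in n_stations]
    ys = [math.log(p) for p in precisions]
    count = len(xs)
    mx, my = sum(xs) / count, sum(ys) / count
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return -slope, math.exp(intercept), r
```

Feeding in points generated from 20 * N**(-1.5) at N = 2 through 6 returns a power of 1.5 and r of -1; with real round-averages, the departure of r from -1 reflects all factors other than number-of-stations.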

FURTHER DATA.  Figure 5 shows average precision of position of Navy aircraft vs average number of
Askania stations computed (Jan-Jul '67). The curve is less steep than for the bombs. There is no clear
indication they were running out of stations well-located for covering the aircraft. Figure 6 shows that, for
the aircraft, precision came closest to improving in direct proportion to number-of-stations.

Mr. Frank Hemingway suggested we look at the vertical component separately. In Figure 7, the dashed
curve is the average quality of Navy aircraft x, y, and z from Figure 5. The solid curve is the precision of the
vertical component only. From inspection of other WSMR cinetheodolite data, this better precision of the
vertical component appears to be a general result. Without z in the composite, the difference would be half
as much again. Please note that these are parallel, except that the z-curve is a bit flatter near the right-hand
end.

Figure 8 shows the effect of number-of-stations on Redeye Contraves (cinetheodolite)
precision-of-position was dominant and profound, even with less data (Jan-May ’67) than on the Navy tests.
In Figure 9, the upper part of the Redeye curve was about a tossup between N^(3/2) and N^2. The lower part
was N^2. I gave N^2 a little edge, overall.

Figure 10 shows the precision curve was slightly less steep for the Redeye target (Jan-May ’67). In
Figure 11, this relationship was a tossup between N and N^(3/2) for the upper half, N^(3/2) lower and overall.

Figure  12  shows  precision  of  Askania  position-measurement  on  the  PEARL  aircraft  radome  (Jan-Jul 
’67)  improved  approximately  as  N'*,  if  we  lightly  regard  the  poorer  4-station  data.  If  we  delete  the  worst 
rounds,  as  indicated  by  Figure  13,  the  3-station  data  was  better  than  the  trend  of  the  rest;  the  4-station  data 
was on the curve; the upper half improved only as N^(1/2).

In  Figure  14,  if  we  didn't  take  our  result  too  seriously,  we  might  approximate  this  Redeye 
fixed-camera data (Jan-May '67) by the solid curve - which turned out to be nearest to improvement of
precision  in  direct  proportion  to  number-of-stations. 

Figure 15 shows a similar situation held for this limited DOVAP data on Lance (Oct ’66 - June ’67).
Because  of  the  very  small  amount  of  data,  it  appeared  desirable  to  also  look  at  2  of  the  averages  on  the 
assumption  that  they  might  not  be  representative  samples.  In  Figure  16,  our  solid  approximation  turned 
out  to  fall  closest  to  improvement  of  precision  in  direct  proportion  to  number-of-stations. 
Preliminary indications were that the average standard deviation of least-squares mean space positions
determined by FPS-16 radar typically improved as the 3/2 power of the number-of-stations.

Apparently, our cumulative-average component-precision of observed position converges rapidly to
correlation  with  number  of  stations. 

It was noted from plotting the foregoing data-curves on the same graph that the exponential rate of
proportional (percentage) improvement of precision, with increase in number-of-stations, depends on the
magnitude of the precision values as well as on the steepness of the curve. Smaller numerical values (better
precision) require less numerical improvement (less steepness) for a given exponential rate of proportional
improvement.

VALUE OF APPROACH.  Findings of this paper are being used in management reviews to express
over- and undermeeting of requirements in terms of resources (resource-equivalent ratios of precisions). The
foregoing data-curves provide specific resource-capability relationships by project. These are relevant to
data-support committing, control, and planning. The data will lend itself to further structuring of our
capabilities by graphing in various ways. Also, to advancing the state-of-the-art and the
state-of-the-understanding - as the following pages indicate.

SUMMARY FOR CINETHEODOLITES.  Figure 17 is a summary for Askania and Contraves
position-precision, based on all the cine-position data I have plotted to date.

The upper half of this table shows the (approximate) spectrum of dependence-of-precision on various
powers of N which was produced by the interaction of cinetheodolite systems with various
flight-measurement tasks. Relative point-of-reference difficulty may explain the broad precision-response
spectrum of the cines. The PEARL radar pod is a large, black hemisphere without distinguishable markings.
It is more difficult to establish a consistent reference on large aircraft than on small, and more difficult on
small aircraft than on small or medium missiles or bombs. The exponential dependence is apparently an
index of the measurability of an object.

Our management reviews show the factor of over- or undermeeting of requirements in terms of
resources (stations) - where this is not identical with effective ratio of average-precision to its requirement.
The proportionalities of numbers of stations to precision are, of course, the inverses of the upper table in
Fig. 17 - as shown in the lower table. We use relationships specific to projects where they are available. If
dependence must be obtained from the table, we show the resource-equivalent ratio in parentheses.
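
The inversion from precision to stations can be sketched as follows. This assumes precision varies as N to the minus `power`, so that the number of stations varies as precision to the minus 1/`power`; the function name and the direction of the ratio (stations needed over stations used) are our assumptions, not taken from Figure 17.

```python
def resource_equivalent_ratio(precision, requirement, power):
    """Stations needed / stations used, assuming precision ~ N**(-power).

    precision and requirement are in the same units (smaller is better).
    A result above 1 means the requirement was undermet in station terms;
    below 1 means it was overmet.
    """
    if precision <= 0 or requirement <= 0 or power <= 0:
        raise ValueError("arguments must be positive")
    return (precision / requirement) ** (1.0 / power)
```

Missing a 10-ft requirement by a factor of 2 costs a factor of 4 in stations if precision improves only as N^(1/2), but only about 1.4 if it improves as N^2, which is why the station ratio is not identical with the precision ratio.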

Now, it's easy to say precision can improve in proportion to as much as N^2 in flight-measurement
because it's a 3-dimensional process. But what are the mechanisms by which this takes place - and what is
the relative importance of each? We have turned over a few stones, and here is a list:

MECHANISMS. 

1.  The decrease through increase in sample-size alone of our uncertainty as to what the
observations say the data value would be if we could repeat the same measuring process indefinitely. This is
the well-known statistical convergence (Ref. 4). It is generally accepted, locally, that our 'Davis' method for
cine data effectively takes this into account.

So, σ_c = f (N^m), where m = f (sample size;

Let's call this one improvement by overcoming small sample-size.

2.  Improvement,  through  increase  in  sample-size,  of  the  average  goodness  of  the  intersection-angles 
of  the  lines-of-sight  from  the  stations  (less  chance  of  only  bad  intersections).  The  relationship  of 
intersection-angle  to  both  precision  and  accuracy  was  treated  in  my  earlier  clinical  paper  (Ref.  6).  The 
angle-of-intersection mechanism has some diminishing returns as the useful cones (with vertex at the
missile)  become  divided  into  smaller,  less  desirable  intersection-angles.  In  multi-station  measurement,  the 
projected  intersections  of  each  station-line  with  each  of  the  others  are  relevant  to  linear  precision. 

Simulations, informally communicated by Mr. W. V. Hereford of Sandia Corp. (Ref. 7), showed a
first-power improvement in rms position-error while overcoming his ‘worst-case’ geometries; only a
half-power improvement while interacting with his ‘best-case’ geometries. (Gradual expanding of a narrow
baseline vs gradual spacing-in of a wide one.) This establishes an effect of geometry on the power of the
precision-response.

Multidimensional measurement depends on geometric convergence as well as on numerical
convergence. 

Figure 18 shows an acute convergence-angle (smaller than the 90° convergence which Reference 6
deduced to be the general optimum). Figure 18 can be any plane through 2 stations and the missile. Let the
± angular dispersion about the direction lines be an average angular standard-deviation (σ_A).

The diagonals of the smaller almost-diamond are its linear-standard-deviation subtenders of σ_A in the
directions perpendicular to and parallel to the baseline.

Solving  either  error-triangle,  SMO,  which  contains  the  vertical  diagonal  (given  the  angles  and  the  slant 
range): 


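$$\sigma_\perp = \frac{r \sin\sigma_A}{\sin(180^\circ - \theta/2 - \sigma_A)} = \frac{r \sin\sigma_A}{\sin(\theta/2 + \sigma_A)}$$

The horizontal diagonal may be obtained from:

$$\cot\frac{\theta}{2} = \frac{\sigma_\perp}{\sigma_\parallel} \approx \frac{r \sin\sigma_A / \sin(\theta/2)}{\sigma_\parallel}$$

whence:

$$\sigma_\parallel = \frac{r \sin\sigma_A}{\sin(\theta/2)\,\cot(\theta/2)} = \frac{r \sin\sigma_A}{\cos(\theta/2)} = \frac{r \sin\sigma_A}{\sin(\phi/2)}$$

where φ is the parallel convergence-angle.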

σ_⊥ may be taken as a measure of the badness of the convergence for measuring perpendicular to the
baseline. σ_∥ may be taken as a measure of the badness of the convergence for measuring parallel to the
baseline. (They are inverse measures of the goodness of the convergence for these purposes.)

If Figure 18 is a vertical plane parallel to the baseline, σ_⊥ measures the goodness of the vertical
projection of the convergence (in terms of the other projected quantities) for measuring the z coordinate of
the missile. This measurement depends only on the elevation readings. Here, σ_⊥ measures σ_z (of linear
observations, not our standard precision of the least-squares mean). Its units are those of the slant
range.

Per the first of the above 3 equations:

Vertical Convergence Precision and Variance Factors

Vertical              Precision      Variance
Convergence (θ)       Factor         Factor

180°                    1.00            1.00
135°                    1.08            1.17
 90°                    1.41            2.00
 45°                    2.61            6.82
 15°                    7.66           58.6
  5°                   22.9           524.
  1°                  114.4        13,120.

This table compares the goodness of vertical convergence at any constant value of projected slant range,
which avoids confusing the effect of range on linear precision with the effect of convergence-angle. It
indicates that the optimum vertical convergence, per se, is not 90°; that the most precise measurement of
the vertical, as far as geometry goes, is when the object is in the line-of-sight between the stations.


Vertical convergence, in Fig. 18, can be easily calculated for a given case as:

$$\theta = 2 \arctan\left(\frac{\text{baseline}}{2\,(\text{altitude})}\right)$$

The  keys  to  good  vertical  convergence  are,  of  course,  long  baselines. 
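
The vertical table can be reproduced, to slide-rule accuracy, from the first equation with σ_A neglected beside θ/2, so that the precision factor is 1 / sin(θ/2). The sketch below uses that simplification, which is an assumption; the function names are hypothetical.

```python
import math

def vertical_convergence_deg(baseline, altitude):
    """Convergence angle (degrees) for a missile over the baseline midpoint."""
    return math.degrees(2.0 * math.atan(baseline / (2.0 * altitude)))

def vertical_precision_factor(theta_deg):
    """1 / sin(theta/2): precision relative to the 180-degree case."""
    return 1.0 / math.sin(math.radians(theta_deg) / 2.0)

def vertical_variance_factor(theta_deg):
    """Square of the precision factor."""
    return vertical_precision_factor(theta_deg) ** 2
```

A baseline equal to twice the altitude gives 90° and a precision factor of about 1.41; shrinking the baseline to about 0.26 of the altitude gives 15° and a factor near 7.7, which is why long baselines matter.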


1 The mode of σ_⊥ (or σ_∥) is defined by the mode of σ_A (e.g., observations, mean, series, curve, etc.).
If Figure 18 is a horizontal plane, σ_⊥ measures the goodness of the horizontal projection of the
convergence for measuring perpendicular to the baseline, and σ_∥ measures the goodness of the horizontal
projection of the convergence for measuring parallel to the baseline. (When the baseline is east-west, σ_⊥
measures σ_x and σ_∥ measures σ_y.) Since the two horizontal measurements share their plane of projection, a
gain in convergence for one is a loss in the other. In general, it's not sound practice to improve data in one
coordinate by making it worse in another. So, the practical optimum horizontal convergence is 90°.

The separate formulas for σ_⊥ and σ_∥ are of relative value for horizontal measurement.
Otherwise, by the second of the above 4 equations:


Horizontal Convergence Precision And Variance Disparity

Perpendicular         Precision Ratio     Variance Ratio
Convergence (θ)       σ_⊥ / σ_∥           (σ_⊥ / σ_∥)²

179°                  1/114.6             1/13,140
175°                  1/22.9              1/525
165°                  1/7.60              1/57.6
135°                  1/2.41              1/5.83
120°                  1/1.73              1/3.00
 90°                  1.00                1.00
 60°                  1.73                3.00
 45°                  2.41                5.83
 15°                  7.60                57.6
  5°                  22.9                525
  1°                  114.6               13,140

This table compares the disparity of the two horizontal convergences at any given ground range. The ratios
of precisions approximate the ratios of width/length (or length/width) of the actual horizontal
linear-observation-error ellipses. The ratios of variances approximate the ratios of areas of circles whose
diameters are the two diameters of the ellipses. (These ratios would be the same for the least-squares means
as for their linear observations.)


Perpendicular convergence in Fig. 18 is, of course, calculated for a given case as:

$$\theta = 2 \arctan\left(\frac{\text{baseline}}{2\,(\text{perpendicular distance})}\right)$$


One key to optimum horizontal convergence is the set of optimum configurations in Figure 19, which
were demonstrated in Reference 6. The least that should be done is to compensate a narrow convergence in
any horizontal direction with another that is roughly perpendicular to it.

The numbers in the above 2 tables are similar, but only the first table is a direct measure of loss. The
second is the ratio of loss in one direction to loss in the direction at right angles.
A direct measure of the combined goodness of horizontal convergences is the root-mean-square
average of their perpendicular and parallel precisions.

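Per the first and third of the above 5 equations:

Horizontal Average Precision And Variance Factors

Perpendicular         Precision      Variance
Convergence (θ)       Factor         Factor

179°                  81.0           6561
175°                  16.2            263
165°                   5.46            29.8
135°                   2.00             4.0
 90°                   1.41             2.0
 45°                   2.00             4.0
 15°                   5.46            29.8
  5°                  16.2            263
  1°                  81.0           6561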

This table compares the goodness of horizontal convergences, at any given ground range, in terms of the
average quality of horizontal-component data. It confirms the 90° optimum. Average horizontal quality is
somewhat less affected than vertical quality by a given narrowness of convergence, but it enters twice into
data quality. Comparing twice the variance in this table with the variance in the first of the above tables, it
is seen to be equally important to avoid narrowing horizontal and vertical convergences in the region from
45° to 0°. Over convergences from 45° to 90°, the horizontal goes from being equally important to being
twice as important.
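
Both horizontal tables follow from the perpendicular and parallel factors 1 / sin(θ/2) and 1 / cos(θ/2), again neglecting σ_A beside θ/2 (an assumption). The sketch below computes the disparity ratio and the root-mean-square average; the function names are hypothetical.

```python
import math

def horizontal_disparity(theta_deg):
    """Ratio of perpendicular to parallel precision, cot(theta/2)."""
    return 1.0 / math.tan(math.radians(theta_deg) / 2.0)

def horizontal_average_factor(theta_deg):
    """rms of the perpendicular and parallel precision factors."""
    half = math.radians(theta_deg) / 2.0
    perpendicular = 1.0 / math.sin(half)
    parallel = 1.0 / math.cos(half)
    return math.sqrt((perpendicular ** 2 + parallel ** 2) / 2.0)
```

The rms average simplifies algebraically to sqrt(2) / sin(θ), which is symmetric about 90° and smallest there, in agreement with the table.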

The case of the missile in the plane normal to a 2-station baseline at its midpoint (Figure 18) was
picked to simplify the math - in the interest of physical understanding.


I think we can say our net station configuration is as much a chance proposition as our net number of
stations.  Of  course,  our  results  reflect  our  average  station-configurations. 


The above analytical approach has demonstrated ample potential for improving position-precision by
improving the average goodness of intersection-angles, through increase in sample-size. (Toward a
happy-medium convergence.) For the z coordinate, the first of the above tables indicates the direct effect
of average convergence-angle on precision. For x and y, the last of the above tables indicates the direct
effect of average convergence-angle on their average precision. (2 optical stations have 1 convergence, 3
stations 3, 4 stations 6, etc.)
The 'readability' mechanism speculated under SUMMARY FOR CINETHEODOLITES would differ
from the others in this paper in being a degradation of gains that would otherwise be made. I am inclined,
now, to explain our cine precision spectrum as due to readability and/or to increasingly vertical trajectories
in going from left to right in the upper half of Figure 17.

So, σ_c = f (N^m), where m = f ( : test configuration;

Let's call this mechanism improvement by overcoming non-optimum test configuration.

3.  Another possibility is that limitations of our least-squares methods are being averaged-out by
(sheer) number of stations. I have attempted to state what might be called practical theorems about
optimum least-squares (Ref. 8). These are given in Figure 20.

(1)  Statistical optimum (minimizing variation among individual observations). WSMR’s Davis
methods for cines and for DOVAP (Ref. 9) are optimum by this criterion. Its Odle and Bodwell (Ref. 10)
cine methods, used to some extent in the past, crippled their own ability to estimate statistically-true
population-means and population-variances, by deterministically cutting their sample-sizes in half
(least-squaring deviations of direction-lines). Until recently, WSMR's multistation-radar method
deterministically cut its sample-sizes to one-third (by least-squaring deviations of 1-station solutions). Our
new multistation-radar method is optimum by this criterion. Our Davis cine method is not
statistically-optimum in estimation-of-quality, because it propagates an average angular-precision.

(2)  General optimum (transforming residuals from station-variable to missile-variable before
optimizing). Our former Odle and Bodwell cine methods and our recent radar method were optimum by
this criterion, in the sense that they optimized linear deviations of 'observations'. This paper suggests that
our Davis cine method is not optimum by this criterion, because it optimizes only similarity at the stations -
not overall congruence; that is, it treats stations equally regardless of their slant-ranges from the missile and
of the convergences of their lines-of-sight with those of the other stations. It optimizes angular quality of
the stations; it does not optimize linear-position quality of the missile. (Some work has been done at WSMR
toward a linear least-squares method for cinetheodolites.)

(3)  Quality optimum (avoiding the probable loss of accuracy inherent in geometric-averaging of
angular observations). Per right triangles: if 2 azimuth planes both miss a least-squares position solution (in
general they will), their intersection will miss it farther than either plane (hypotenuse vs perpendiculars).
Ditto for 2 elevation cones. If the azimuth plane and elevation cone of a station both miss a least-squares
solution (in general they will), their intersection (the missile direction) will miss it farther than either
surface. It follows that the linear errors of our former Odle and Bodwell, and recent radar, methods were
probably larger than those of methods which least-square the original observations. So, those methods
probably degraded physical accuracy. Our Davis cine and DOVAP and our new multistation-radar methods
are optimum by this criterion.

(4)  Summary. It seems clear that the criterion of a totally-optimum reduction can be met only
by minimizing the sums-of-squares of the linear perpendiculars to: azimuth-planes, elevation-cones,
range-spheres and loop-range-ellipsoids; and that results by such optimum methods will be somewhat
different, more precise, and probably more accurate than by our current methods.
Mr. Darold Comstock suggested accomplishing the proposed least-squaring of linear residuals by
multiplying elevation residual by slant-range and azimuth residual by ground-range. This is radius times
(angle in) radians. Algebraically it is a 'weighting'. Geometrically (and physically) it is a conversion. The
subsequent algebra must be changed accordingly. (It is still necessary to propagate these linear standard
deviations into those of the components of the least-squares point.)
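
The radius-times-radians conversion can be sketched as below. It covers only the conversion of one station's angular residuals to linear ones, not the subsequent least-squares algebra; the function name and argument order are hypothetical.

```python
def linear_residuals(az_resid, el_resid, ground_range, slant_range):
    """Convert one station's angular residuals (radians) to linear residuals.

    The azimuth residual is scaled by ground range, and the elevation
    residual by slant range (arc = radius * angle). The results carry
    the units of the ranges.
    """
    if ground_range < 0 or slant_range < 0:
        raise ValueError("ranges must be non-negative")
    return ground_range * az_resid, slant_range * el_resid
```

Two stations with the same 0.1-milliradian elevation residual contribute 1 ft at a 10,000-ft slant range but 5 ft at 50,000 ft, so the distant station no longer counts equally with the near one.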

It  should  be  apparent  that  mechanism  3  interacts  with  mechanism  2  and  with  differences  in  slant 
range. 

Let's  sample  the  potential  of  linear  least-squares  for  improving  precision  (and  probably  accuracy). 

The configuration at the top of Figure 21 can show: a vertical projection of 2 stations at any point on
their respective 'azimuth' circles about OM; or, a horizontal projection in any direction about point O. (S1
can also 'flop' 180°.) In the graphical representation, next: starting with the angular-LS solution, the
'direction-line' is allowed to swing until the linear residuals are equal. Total linear error, summation to point
'T', is also shown for each method. In doing this arithmetic, I actually used the slant range (or ground
range) in the radius-times-radians approximation of arc for perpendicular to projection of El-cone or
Az-plane.

Taking  it  slowly: 

Angular least-squares yields the (arithmetic) mean of the angular residuals. (In this simple case, it
makes the angular residuals equal.)

Linear least-squares yields the mean of the linear residuals. (In this simple case, it makes the linear
residuals equal.) The angular subtends of the linear residuals are not equal, because their unequal scales of
observation have been taken into account. By incurring a little bigger error with respect to S', we minimize
the total error. (Optimizing our end-result seems a little unnatural in our range environment.) The
improvement is 38% in precision; 61% in variance. In a redundant case, use of degrees of freedom in
calculating these would increase the differences. Finally, the change in the position component would
generally lie between the minus 10.1 and plus 3.0 ft ballparks (of the changes in the linear precisions). On
the average, I feel it should represent that much improvement in physical accuracy.
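The two percentages quoted above are tied together by simple arithmetic, since variance is the square of precision. The helper below is a hypothetical illustration of that conversion only; it does not reproduce the Figure 21 geometry.

```python
def variance_improvement(precision_improvement):
    """Fractional reduction in variance implied by a fractional reduction in sigma."""
    remaining_sigma = 1.0 - precision_improvement
    return 1.0 - remaining_sigma ** 2

improvement = variance_improvement(0.38)
```

A 38% reduction in sigma leaves 0.62 of the original sigma and so about 0.38 of the original variance, a reduction a little over 61%, in line with the figure quoted.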

If  one  faces  the  linear  errors  behind  our  angular  errors,  to  the  point  of  a  linear  LS,  the  foregoing 
dependence  of  linear  error  on  slant-range  nullifies  an  assumption  of  'rigorous'  derivations  of  the 
least-squares  principle  -  that  the  variances  are  not  significantly  different.  Least-squares  still  yields  the 
minimum  vector-resultant  of  the  observed  errors. 

It has been suggested that the large linear deviations should be weighted. Perhaps inversely as their
slant-ranges! In effect, angular least-squares does that. This paper suggests we may be deceiving ourselves, in
linear measurement, if we least-square angular residuals in order to perform our least-squares with 'nearly
equal' variances. Slant-range is a physical variable - not a statistical weight. (The differences which it causes
in linear error are not due to random sampling.) Our example indicated which procedure yields the smaller
variance of our end-result. Implicit inequality of variances in our angular-LS apparently does more harm to
our result than if this inequality were minimized by linear-LS.

Mr. Comstock's r σ_A is a very good approximation of the actual 'linear-equivalent' of an angular
residual (r sin σ_A: that subtender of σ_A which is perpendicular to the El-cone or Az-plane).

In Figure 7, you saw our linear precision of a cine vertical component better than the linear precision
of the horizontal components (FURTHER DATA, above). This does not hold true for the corresponding
angular residuals (the angular precision of the vertical runs a bit worse). It appears this anomaly is all due to
the interaction of our angular 'Davis' method with convergence, slant-range, and the azimuth-elevation
system.

As you can see in Figure 22, the vertical subtender of a given angular error increases with elevation
angle, for a given horizontal range. In the top drawing this actually overcompensates the effect of the
difference in slant-range - which is the cause of the difference between angular and linear 'Davis' methods.
This compensative case is the whole story for the vertical plane - but half the story for the horizontal plane
wherein we subtend with the normal-to-the-line-between-the-stations. The bottom, additive case, is the
other half of the story for the horizontal plane. What we can win in one direction we more than lose in the
other. The net is an average uncompensation of the equality of x and y. So, our Davis precisions of x and y
do not approach the optimum of a linear least-squares.

This appears to explain the better precision of WSMR's vertical component. But, how much of the
effect of number of stations can it account for? It turns out, the left end of the curve of σ_z is slightly
steeper relative to its N-to-the-first-power curve than is the left end of the curve of the composite σ. The
right end is slightly shallower; but the curves are equally close to N-to-the-first-power, overall. So, the
deficiency of our angular Davis does not appear to be a large part of the particular answer we set out to find
in this paper (a little more on this under mechanism 5).

Let's combine the effects of slant-range and convergence to evaluate the net implications for
linear-vs-angular LS and for z vs x, y.

Our first equation under mechanism 2 approximates any station's separate contribution to projected
baseline-perpendicular measures-of-goodness of its measuring-convergence, at any point within-or-between
the baseline-normal planes which pass through it and through any other station - regardless of the separate
angles of their projected lines-of-sight with the baseline-normal through the projected object. Repeating this
equation:

$$\sigma_v = \frac{r \sin \sigma_A}{\sin(\theta/2)}$$

For the top drawing of Figure 22, it turns out that the relationship of the normal subtenders of σ_A is
(sliderule calculation):

$$\sigma_{\perp 60^\circ} = 1.002\,\sigma_{\perp 30^\circ}$$

(The relationship of the corresponding chords which reflect only slant-range is:

$$C_{60^\circ} = 0.583\,C_{30^\circ}$$ .)

So, in the 60° - 30° case, the normal compensation of slant-range by observing-angle is virtually perfect.


When the stations are not equidistant, the opposite diagonals of the convergence error-figure are no
longer perpendicular. The third equation under mechanism 2 still approximates the parallel subtenders in
the bottom drawing of Figure 22. It turns out that:

$$\sigma_{\parallel 30^\circ} = 2.945\,\sigma_{\parallel 60^\circ}$$

The chords are unchanged and their relationship may be written:

$$C_{30^\circ} = 1.714\,C_{60^\circ}$$

Considered with the normal, this gives some feeling for the net uncompensation of slant-range by
observing-angle in the horizontal plane.

For our LS example of Figure 21, it turns out that the relationship of the normal subtenders of σ_A is:

$$\sigma_{\perp 60^\circ} = 0.578\,\sigma_{\perp 15^\circ}$$

(The relationship of the corresponding chords which reflect only slant-range is:

$$C_{60^\circ} = 0.299\,C_{15^\circ}$$ .)

So, in the 60° - 15° case, the normal compensation of slant-range by observing-angle is quite inadequate. (In
the 60° - 45° case, there is over-compensation - by 1.16.)

It appears that if we took slant-range into account in our estimation of position, we could produce
more precise and accurate data (from the same records).

The above analytical approach has demonstrated potential for improving position-precision by
overcoming the deficiency of an angular LS, through increase in sample-size. (Toward a happy-medium
slant-range.)

So, σ_c = f (N^m), where m = f ( ; ; optimization criterion;

Let's call this mechanism improvement by overcoming nonoptimum choice of variable-to-be-optimized.

4. The decrease, through increase in sample-size, of the uncertainty of the directional aspect of
position and position-quality. This is associated with the increasing probability (as we increase stations) that
the 3-dimensional least-squares optimum will also be the least-squares optimum for each component.

In simpler language, we are talking about the probability (at each data-point) that the 3-way-average
position will also be the average position in x, the average position in y, and the average position in z. A
small sample that satisfies these 3 conditions seems about as likely as 3-cherries-in-a-row (on a slot
machine).

An analytical approach to 3-dimensional vs 1-dimensional sampling is not within the scope of this
paper. The following table gives some feeling for the extent to which these differ.


NUMBER        % LINEAR       % SPHERICAL
OF SIGMAS     PROBABILITY    PROBABILITY*

    1            68.3           19.9
    2            95.5           73.9
    3            99.7           97.1
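The tabulated percentages can be checked with a short calculation. The sketch assumes (the table itself does not say so) that 'linear' means one normally-distributed component and 'spherical' means the radial miss of three independent, equal-variance normal components; the function names are illustrative.

```python
import math

def linear_probability(k):
    """Probability that one normal component lies within k sigmas of its mean."""
    return math.erf(k / math.sqrt(2.0))

def spherical_probability(k):
    """Probability that a 3-component normal miss lies within a sphere of radius k sigmas.

    Assumes three independent components of equal sigma.
    """
    radial_term = math.sqrt(2.0 / math.pi) * k * math.exp(-0.5 * k * k)
    return linear_probability(k) - radial_term

table = [(k, 100.0 * linear_probability(k), 100.0 * spherical_probability(k))
         for k in (1, 2, 3)]
```

Under that assumption the computed values agree with the table to within rounding in the last digit.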

In connection with verifying that our matrix algebra yields the standard-error of a mean
(ESTIMATION OF PRECISION, above), the standard-deviation and standard-error-of-mean of each set of
linear-component equivalents of angular residuals will be calculated about its own mean (as well as about
the least-squares mean). The differences in the variances of each component for the two means will sample
the statistical-error in our assumption that the 3-dimensional least-squares optimum is also the least-squares
optimum for each component. The component differences in the two means will sample the bias of
measurement of each component relative to measurement of the 3-dimensional quantity (or vice versa).

The above discussion has indicated a potential for better optimizing the precision of each component,
through increase in sample-size. (Toward a happy 3-dimensional medium.)

So, σ_c = f (N^m), where m = f ( ; ; ; 3-dimensional sampling;

Let's call this one improvement by overcoming our inability to optimize each dimension.

5. WSMR's Final Data Reports show how the normal-distribution value of 1.65σ at 90% probability
increases for individual standard-deviations at the small numbers of stations used for a position-estimate. As
tempered by dividing out the corresponding smaller increase of the factor 1.00 at 68.3% probability,
routinely introduced by Data Analysis Directorate ('t' statistic, ESTIMATION OF PRECISION, above),
Figure 23 lists values of the 't' correction at 68.27% probability (Ref. 11).

This paper suggests that our t(68.27) correction to individual standard-deviations should also be divided
out when the degrees-of-freedom are increased by averaging. That the average position-precision should
then be multiplied by the value of t(68.27) for the degrees-of-freedom in the average (where this value of t is
not negligibly close to unity).

* These values were obtained independently by Mr. Gideon Culpepper and by Dr. H. H. Germond.

We  are  concerned  with  the  quality  of  our  data  in  the  form  that  it  is  reported  to  the  user.  Our  basic 
unit  of  position-quality  is  the  population-estimate  of  the  quality  of  each  component-value.  This  is  properly 
normalized  for  sample-size  and  reported  to  the  user  in  our  Final  Data  Reports. 

Our test (or segment) qualities are clearly the average quality of their individual component-values. If
these data-values were independent, normally-distributed, and equivariant, the degrees-of-freedom in the
average-quality of a round (or segment) would be: 2N-3 times the-number-of-component-qualities-averaged.
Regardless of limitations of the validity of these statistical assumptions, this apparent
degrees-of-freedom in the average is still our best-available estimate - incomparably better than using 2N-3
(the average degrees-of-freedom of an individual component-value). Averaging is a normalizing process.

Our various cumulative-average qualities, in which rounds are given equal weight, are practical
approximations of the cumulative-average quality of their individual component-values. The best-available
estimate of their degrees-of-freedom is 2N-3 times the-average-number-of-component-values-in-a-round
times the-number-of-rounds-averaged.
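The degrees-of-freedom bookkeeping described in the last two paragraphs amounts to three products. The sketch below uses hypothetical function names and simply restates the apparent degrees-of-freedom; it makes no allowance for the dependence discussed next.

```python
def dof_individual(n_stations):
    """Degrees-of-freedom of one optical component-value: 2N - 3."""
    return 2 * n_stations - 3

def dof_round_average(n_stations, n_component_values):
    """Apparent degrees-of-freedom of a round (or segment) average."""
    return dof_individual(n_stations) * n_component_values

def dof_cumulative_average(n_stations, avg_values_per_round, n_rounds):
    """Apparent degrees-of-freedom of a cumulative average over rounds."""
    return dof_individual(n_stations) * avg_values_per_round * n_rounds
```

For example, a 2-station optical solution has 1 degree-of-freedom and a 3-station solution has 3, so a 3-station round averaging 100 component-values would be credited with an apparent 300.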

Our (equivalent) linear-component deviations are not independent, because we least-square
observational-deviations simultaneously in 3 dimensions. Component variances are further lacking in independence,
because we propagate them from an average angular-variance. Now, effective degrees-of-freedom of these
less than their 2N-3 means that our individual precision estimates are too good. Also, trajectory-
measurements closely-spaced-in-time cannot reasonably be assumed independent. Effective sample-size of
averages less than their number-of-component-values means that our estimated average-precisions are too
good.

It is planned to follow-up Mr. Charles Bicking's suggestion to rms random samples of 25-or-less
position-component sigmas from a given test; also, from a given series-of-tests - to get around lack of
normality, independence, and homogeneity among our component-values. Then, to compare these and our
regular-average sigmas with ASTM control-chart limits for their respective 'sample-sizes' - to sample the
error in our assumption that our data-values have the above properties. Suggestions: That sample-size which
places a regular-average sigma in the same proportional relationship to its control-chart limits as the
corresponding random-sample sigma will be its effective sample-size. (If upper and lower limits yield
different values, these may be averaged.) An appropriate set of sample-size conversion-factors for our
average-sigmas can be generated in this manner. Use of effective sample-size will yield valid average-sigmas.
(An existing average may be corrected by dividing by the square-root of its sample-size conversion-factor.)
One minus the ratio of an average-sigma calculated from its apparent sample-size to its value calculated
from its effective sample-size will be an estimate of the coefficient-of-linear-correlation of its component-
values. (More simply obtained as one minus the square-root of its conversion-factor.)

Prof. William Kruskal's Comments on my paper show how negative³ correlations of observations
increase the precision-response exponent.
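The effective-sample-size correction proposed above can be sketched as follows. The paper does not define the conversion-factor explicitly, so the sketch assumes it is the ratio of effective to apparent sample-size; the function names and numbers are hypothetical.

```python
import math

def conversion_factor(effective_n, apparent_n):
    """Assumed definition: effective sample-size over apparent sample-size."""
    return effective_n / apparent_n

def corrected_average_sigma(sigma_from_apparent_n, factor):
    """Correct an existing average by dividing by the square-root of the factor."""
    return sigma_from_apparent_n / math.sqrt(factor)

def correlation_estimate(factor):
    """One minus the square-root of the conversion-factor."""
    return 1.0 - math.sqrt(factor)

# Hypothetical case: 100 component-values behaving like 25 independent ones.
factor = conversion_factor(25, 100)
sigma_corrected = corrected_average_sigma(2.0, factor)
rho = correlation_estimate(factor)
```

In this made-up case the factor is 0.25, the average-sigma doubles from 2.0 to 4.0, and the correlation estimate is 0.5. The longer form in the text, one minus the ratio of the apparent-size sigma to the effective-size sigma, gives the same value.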

The 't' correction normalizes the distributions of samples. In Figure 23, the larger values of t at low
degrees-of-freedom reflect our inability to know the quality represented by one small sample. But, we know
the round- and cumulative-average qualities of our low-degrees-of-freedom data quite well.

The foregoing implies that the WSMR average-precision-vs-number-of-stations data in Figures 2 and
4-16 should be divided by the values of t in Figure 23. That the average-precision of WSMR's 2-station (1
degree-of-freedom) optical position-data is substantially better than has been realized. 3-station (3
degrees-of-freedom), somewhat better. And, so on. (Similarly for DOVAP.)

On this basis, Figure 24 is the corrected version of Figure 9 (Redeye Contraves-precision vs
number-of-stations and its exponential models). Comparing: Between 2 and 3 stations, our curve went from
somewhere among its N’ and N3 n models to somewhere among its Ns/1 and N models. Between 3 and 4
stations, our curve went from N* to N3-1. Between 4 and 5 and 5 and 6 stations, there was little change.
So, mechanism 5 accounted for N3M between 2 and 3 stations and for N1 ,J between 3 and 4 stations. (For
nothing between 5 and 6 and 6 and 7 stations.)

Figure 25 is the correspondingly corrected version of Figure 11 (Redeye target Contraves-precision vs
number-of-stations and its models). Our 'fitted' curve went from being closest to N3/3 to being closest to N.

It turns out that mechanism 5 changes the upper-half of our Lance DOVAP-precision curve (Figures
15 and 16) from proportionality to N to proportionality to N1 /s. The lower-half of the same curve remains
at proportionality to N.

Mechanism 5 leads me to put more weight on the right-half of my comparisons of precision of z and of
x, y, z (Figure 7) with their models. The right-half of the curve of σ_z is slightly shallower relative to its
N-to-the-first-power curve than is the right-half of the curve of the composite. This appears to show some
effect of mechanism 3 on the power of the precision response (more room for overcoming the horizontal
plane's net-uncompensation of differences in slant-range).

So, σ_c = f (N^m), where m = f ( ; ; ; ; estimating degrees-of-freedom;

Let's call this one improvement by overcoming nonoptimum estimate of degrees-of-freedom.

6.  The  open-ended  residue  of  other  possible  mechanisms  of  improvement  of  flight-measurement 
precision  by  increasing  stations. 

In  this  paper,  we’re  talking  about  how  fine  a  tolerance  we  can  meet  by  increasing  (and  optimizing) 
comparisons  in  flight-measurement.  Number  of  possible  minimum  solutions  expresses  the  number  of 
possible  comparisons  of  observations  on  the  common  basis  of  our  end-result.  For  optics: 


NUMBER OF   NUMBER OF      D.F. OF        NUMBER OF   D.F. OF
STATIONS    OBSERVATIONS   OBSERVATIONS   SOLUTIONS   SOLUTIONS

  1/2           1             -2            1/3         -2/3
  1             2             -1            2/3         -1/3
  1 1/2         3              0            1            0
  2             4              1            4            3
  2 1/2         5              2            9 or 10      8 or 9
  3             6              3            19           18
  etc

³ When one increases the other decreases. This is commonly the case in 3-dimensional LS solutions.
These are usually influenced by measurement in one dimension only at the expense of measurement in
the others.


It turned out that replacing number-of-observations by degrees-of-freedom (degrees-of-redundancy) of
observations in the models of Figure 3 gave curves whose shapes were less like our operational curves.
Curves based on degrees-of-freedom (degrees-of-redundancy) of solutions, and on number-of-solutions,
would be still less so.

However,  number  of  possible  comparisons  and  number  of  independent  comparisons  (D.F.  of 
solutions)  may  be  relevant  to  learning  the  degree  to  which  stations  are  capable  of  mutual  calibration  (to 
improve  system-precision)  and  of  calibrating  one  another  (EFFECT  ON  PHYSICAL  ACCURACY,  below). 

Figure 26 shows the frequency-distributions of the squares of the precisions which were sorted by
number-of-stations in Figure 2. It appears that correlation of other determinants of position-quality with
number-of-observations clearly corresponds to contraction of extended frequency-distributions of variance
through increasing** degrees-of-freedom chi-squared (or less-skewed normal) distributions, as number-of-
stations increases. This change of shape is an effect, not a cause.

Figure 27 summarizes the five clear-cut mechanisms of this paper, plus its open-ended 'catchall'. My
present guess is that the order of descending (but real) magnitude of the effects of the first five mechanisms
is: 2; 1; 5; tie between 3 and 4.

RATING CONFIGURATIONS OR STATIONS. Suggested procedure:

1. Generate a more-complete version of the 1st and 3rd columns of the first table under mechanism
2, above. Also, a more complete version of the 1st column and of twice the 3rd column of the third table
under mechanism 2. This is properly done with a trig. table and a slide-rule.

2. On a reproduction of a map or scale-drawing of a given (or proposed) configuration, draw all
possible sides and diagonals. (A diagonal can be external.) Estimate, and mark with an 'x', the configuration's
center-of-gravity on the basis of visual judgment - aided somewhat by little-circles around all intersections
of its diagonals. The configuration should be chosen so the nominal trajectory will pass near its
center-of-gravity.


4*  In  the  sense  of  affording  a  choice, 
**  But  effectively  very  low. 


3. For some given (or proposed) altitude above this C.G. - or above the nearest point to it on the
nominal trajectory - approximate the projected vertical convergence for each possible (2-station) baseline
as:

$$\theta = 2 \arctan\left(\frac{\text{baseline}}{2\,(\text{altitude})}\right)$$

where each baseline is measured graphically. The altitude will generally be that of the nominal trajectory at
the midpoint of this planned segment.

4.  Look  up  in  the  first  table,  list,  and  add  the  variances  of  these  vertical  convergences. 

5. Approximate the projected horizontal convergence for each possible baseline as:

$$\theta = 2 \arctan\left(\frac{\text{baseline}}{2\,(\text{perpendicular distance})}\right)$$

where its perpendicular distance is measured graphically from the C.G. - or, better, from the nearest-point-to-it under
the nominal trajectory. Or, preferably, read these horizontal-convergences with a protractor. (Using the
'nearest-point' includes the nearness of the configuration's C.G. to the trajectory in the rating.)

6.  Look  up  in  the  second  table,  list,  and  add  the  double-variances  of  these  horizontal-convergences. 

7.  Combine  totals  from  steps  4  and  6. 

8.  For  a  given  number  of  stations,  pick  the  configuration  with  the  smallest  total  variance. 

9.  In  adding  a  station  to  a  given  configuration,  pick  the  station  for  which  the  sum  of  its 
convergence-variances  (with  all  the  others)  is  the  smallest. 

10. In deleting a station, drop the one the sum of whose convergence-variances is the biggest.
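Steps 3 through 8 lend themselves to a short calculation in place of the graphical work. The sketch below is only an illustration: the look-up tables of mechanism 2 are not reproduced in this part of the paper, so it assumes their entries behave like the trig. functions of convergence-angle quoted under COMPONENTS OF COVARIANCE, below (1/sin² θ/2 for the vertical, and the sum of 1/sin² θ/2 and 1/cos² θ/2 for the doubled horizontal). Station coordinates, function names, and the rated point are hypothetical.

```python
import math
from itertools import combinations

def convergence_angle(baseline, distance):
    """theta = 2 * arctan(baseline / (2 * distance)), in radians."""
    return 2.0 * math.atan2(baseline, 2.0 * distance)

def vertical_factor(theta):
    """Assumed table entry for a vertical convergence."""
    return 1.0 / math.sin(theta / 2.0) ** 2

def horizontal_factor(theta):
    """Assumed table entry (double-variance) for a horizontal convergence."""
    half = theta / 2.0
    return 1.0 / math.sin(half) ** 2 + 1.0 / math.cos(half) ** 2

def perpendicular_distance(point, a, b):
    """Plan-view distance from point to the line through stations a and b."""
    (px, py), (ax, ay), (bx, by) = point, a, b
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    return abs(cross) / math.hypot(bx - ax, by - ay)

def configuration_rating(stations, point, altitude):
    """Steps 3-7: total of the convergence factors over every 2-station baseline."""
    total = 0.0
    for a, b in combinations(stations, 2):
        baseline = math.hypot(b[0] - a[0], b[1] - a[1])
        total += vertical_factor(convergence_angle(baseline, altitude))
        perp = perpendicular_distance(point, a, b)
        total += horizontal_factor(convergence_angle(baseline, perp))
    return total

def best_configuration(candidates, point, altitude):
    """Step 8: the candidate configuration with the smallest total."""
    return min(candidates, key=lambda s: configuration_rating(s, point, altitude))
```

A pair of stations two units apart, rated at a point one unit off the baseline midpoint and one unit up, has 90° convergence in both projections and totals 2 + 4 = 6 under these assumed factors. Narrower or wider baselines give larger totals, which is the behavior the procedure relies on.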

The above approximate method should work fairly well, because of the big variances of bad
intersections. It is valid for optics and radar. Possible refinements include:

(1)  Finding  the  true  C.G.  by  graphical  or  analytical  methods. 

(2) Multiplying each vertical-convergence variance by the square of its projected slant-range:

$$r^2 = \left(\frac{\text{baseline}}{2}\right)^2 + (\text{altitude})^2$$

And, multiplying each horizontal-convergence variance by the square of its projected ground-range:

$$r^2 = \left(\frac{\text{baseline}}{2}\right)^2 + (\text{perpendicular distance})^2$$

(The latter r may also be obtained by averaging the graphical distances from the ends of the baseline to the
C.G. - or to the nearest-point.)

The two tables described under step 1 are optimum figures-of-merit of the various convergence-angles.
As such they are suitable for general use. The overall measuring-effectiveness of most actual convergences
will fall short of their projected-equal-baseline optima. But, losses in going to unequal slant-ranges and to
convergences-external-to-the-baseline should be reflected well enough for ordinary purposes through the
badness of such convergences.


COMPONENTS OF COVARIANCE (GDOP MADE EASY). 'Geometric dilution of precision' most
usefully refers to the geometric components of position-measurement variance. Somewhat less definitively:
'. . . the magnitude of the position errors caused by random measurement errors . . . depends on the
particular parameters measured, the measurement system, location of the measuring equipment and the
location of the missile with respect to the equipment. Variation of the effect of random errors is measured
by a quantity defined as the Geometrical Dilution of Precision (GDOP).' (Ref. 12).


From  MECHANISM  2,  above,  our  equations  for  the  vertical  and  horizontal  projections  of 
linear-measurement  variance  are: 


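$$\sigma_v^2 = \frac{r^2 \sin^2 \sigma_A}{\sin^2(\theta/2)}$$

$$2\sigma_h^2 = r^2 \sin^2 \sigma_A \left(\frac{1}{\sin^2(\theta/2)} + \frac{1}{\cos^2(\theta/2)}\right)$$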


These apply when the missile is in the plane normal to a 2-optical-station baseline at its midpoint. The first
equation also applies to any optical station's separate contribution at any point within-or-between the
baseline-normal planes which pass through it and through any other station - regardless of the separate
angles of their projected lines-of-sight with the baseline-normal through the projected object. These
equations also serve fairly well for radar under these conditions.

σ_v² and 2σ_h² are the vertical and horizontal components of position-measuring variance under the
conditions described.

sin² σ_A is the instrument-component of both vertical and horizontal position-measuring covariance,
under these conditions. The remainders of the right-hand sides of the above equations are the
geometric-components of vertical and horizontal position-measuring covariance (under the conditions
described). r² is the 'scale' subcomponent of the geometric-components of both vertical and horizontal
position-measuring covariance. 1/sin² θ/2 and the parenthetic trig. expression are the 'configuration'
subcomponents of the geometric-measuring covariance. The latter are, then, simple trig.-functions of
convergence-angle. (The square roots of all the above are the 'components' of position-measuring precision
or coprecision.) Our Davis cine method does not optimize the geometric components of position-measurement
covariance.

For a given instrument-precision, the above 2 equations are the vertical and horizontal 'GDOP'
equations - for these conditions. They are a bridge between partial-differential equations, their matrix
representation, and nonspecialists - for the 2-station optimum (baseline-midpoint) set.
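The split into instrument, scale, and configuration components can be laid out numerically. The sketch below is an illustration under the stated baseline-midpoint conditions, not a general GDOP program; the function name, argument names, and dictionary keys are hypothetical.

```python
import math

def variance_components(r, sigma_a, theta):
    """Components of the vertical and doubled-horizontal variance.

    r       : slant-range (vertical) or ground-range (horizontal), any length unit
    sigma_a : angular precision of the instrument, radians
    theta   : convergence-angle at the missile, radians
    """
    half = theta / 2.0
    instrument = math.sin(sigma_a) ** 2
    scale = r ** 2
    config_vertical = 1.0 / math.sin(half) ** 2
    config_horizontal = 1.0 / math.sin(half) ** 2 + 1.0 / math.cos(half) ** 2
    return {
        "instrument": instrument,
        "scale": scale,
        "vertical_variance": instrument * scale * config_vertical,
        "double_horizontal_variance": instrument * scale * config_horizontal,
    }
```

At a 90° convergence the vertical configuration factor is 2 and the horizontal one is 4, which is the smallest value the parenthetic expression can take. At 60° the horizontal factor rises to about 5.3.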

REDUNDANT PRECISION OF NONREDUNDANT SOLUTIONS. It is operationally necessary to
compare the quality of nonredundant solutions for missile-position with the quality of redundant solutions.
(Examples of nonredundant solutions are: 1-station radar, 1½-station optics, and 3-station DOVAP.)

The uncertainty of a nonredundant (zero-degrees-of-freedom) solution is not a problem if one has
sufficient info on the statistical population of which the solution is a sample (of size one) - or on
comparable populations.

The commonest example of a nonredundant solution is a single observation of a one-dimensional
quantity. The concept of the standard deviation of (single) observations is the best-known of all precision
concepts. It is, of course, the characteristic statistical uncertainty of a single observation.

Usually, the parameters of the population-of-observations are estimated from a sample of (several)
observations. The precision of the sample-mean is then given by the relationship:

$$s_{\bar{x}} = \frac{s}{\sqrt{n}}$$

where s is the estimate of the standard-deviation-of-observations, and n is the sample size (Ref. 4).

If only the precision of the mean of a given number of someone's observations (of a one-dimensional
quantity) is available, it is routine to use this equation to calculate the uncertainty of a single
(nonredundant) observation.

In flight-measurement terms, the above equation takes the form of equation (5) of our Figure 3. Or its
working form - equation (1) of the same figure. Empirical generalization of the latter may be written:


$$\sigma_{c,M} = \sigma_{c,k}\left(\frac{k}{M}\right)^{m}$$


And, we have found that we can determine, by trial-and-error, a power of the ratio of numbers of stations
which makes this equation closely fit a segment of a given plot of average-precision vs number-of-stations.

From the foregoing, this relationship for 1 and 2 degrees-of-freedom position-data can clearly be used
to calculate the average component-precision of the corresponding zero-degrees-of-freedom (nonredundant)
solution. As our reductions are presently carried out, this means using 2- and 3-station precisions of radar or
optics or 4- and 5-station precisions of DOVAP. (3-station optics is actually 3 degrees-of-freedom and
3-station radar is 6.) The result will be as representative as the 1 and 2 (or 3) degrees-of-freedom data. The
solution of the above equation for the exponent reduces to:


which can be carried out to the degree-of-fineness desired (three digits are ample). The value of m for 1 thru
2 degrees-of-freedom is, of course, used in the previous equation with the 1 degree-of-freedom data to
calculate the zero-degrees-of-freedom precision. If the 1 degree-of-freedom data are limited to a particular
combination of stations, the result will be average for those stations - slightly influenced by the makeup of
the second (or third) degree-of-freedom.
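The extrapolation described above can be sketched numerically. The sketch assumes the empirical relationship is a simple power law in the ratio of numbers of stations, as the description suggests, and solves for the exponent with logarithms instead of trial-and-error; the function names and precisions are hypothetical.

```python
import math

def fitted_exponent(sigma_a, n_a, sigma_b, n_b):
    """Exponent m such that sigma_a / sigma_b equals (n_b / n_a) ** m (assumed form)."""
    return math.log(sigma_a / sigma_b) / math.log(n_b / n_a)

def extrapolated_precision(sigma_known, n_known, n_target, m):
    """Precision at n_target stations from the precision at n_known stations."""
    return sigma_known * (n_known / n_target) ** m

# Hypothetical radar averages: 10 ft with 2 stations, 6 ft with 3 stations.
m = fitted_exponent(10.0, 2, 6.0, 3)
sigma_one_station = extrapolated_precision(10.0, 2, 1, m)
```

With these made-up numbers the exponent comes out near 1.26 and the nonredundant (1-station) average precision near 24 ft. If precision went exactly inversely as the number of stations, the exponent would be 1 and the 1-station value would be twice the 2-station value.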

if  one  desires  to  calculate  average  x.  y,  and  z  precisions  of  nonredundant  solutions,  the  above  may  be 
carried  out  separjtely.for  average  x,  aveiage  y,  and  average  z  precisipns. 

If  one  desires  tn  calculate  average  precisions  of  the  observations  of  azimuth,  etc.  in  a  nonredundant 
solution,  the  above  may  be 'lurried  out  separately  for  average  azimuth,  etc.  residuals  from  1  and  2 
degrees-of-freedom  solutions.  (Preliminary  Indications  are  that ,  in  our  process,  time-rmsd  deviations,  of  an 
observational  parameter  are  not  independent  of  number-of-stations.) 

If one desires to calculate average component-precisions, or x, etc. precisions, or azimuth, etc. precisions of a particular nonredundant solution, the above may be carried out separately for average component residuals, average x, etc. residuals, or average azimuth, etc. residuals of that particular nonredundant solution, from 1 and 2 degrees-of-freedom solutions. These precisions will of course be somewhat influenced by the makeup of the first (and even second) added degrees-of-freedom.

Combining x, y, and z precisions of nonredundant solutions calculated by any of the above, as an rms, then comparing with the similarly-calculated corresponding average component precision can serve to check the coherent execution of the methods. (When desired coherence is not attained, the best result is the rms of the two.)

The foregoing applies equally well to nonredundant solutions for attitude - or for any other measured missile-flight variable.

The  above  method  is  empirical  only  to  the  degree  that  improvement  of  the  particular  precision  by 
increasing  stations  exceeds  the  purely  repetitive  (statistical)  amount. 

So, there are no conceptual difficulties - physical or statistical - in calculating the average-precision of nonredundant solutions on the same (WSMR standard) basis as the average-precision of redundant solutions. The former may be calculated from the latter, for the same or comparable measuring situations. Where there is overlap, the result may be used in quality-control of the nonredundant solutions. Where redundant precisions from comparable current measuring situations must be used, the result is still a basis for stating nonredundant capability.

GEOMETRICAL  STATISTICS.  Geometry  is  used  here  in  its  plane,  solid,  and  mensuration  senses. 

I  am  inclined  to  consider  geometrical  statistics  a  boundary-discipline  of  geometry  and  numerical 
statistics.  Our  concern  here  is  with  the  statistics  of  geometry  (not  with  the  geometry  of  statistics). 
Regardless  of  terminology,  our  subject  includes  the  following  elements: 

1. Individual measurements of object-position define geometric surfaces in space. (Or geometric curves in the plane of object and observer.) The particulars of these for a station are its own coordinate system. The particulars of these for a system are its reduction geometry. In general, 3 individual measurements of magnitude or direction determine a space-position 5*. The $\sigma_A$ of our Davis cine method is a parameter of an error-ellipsoid which is definable only in terms of the particular measurement-configuration. This paper suggests our use of an average angular precision evades the issue of how the 3 degrees of freedom used by the LS determination of a particular 3-dimensional-Cartesian position are effectively distributed between azimuth and elevation; and, hence, how they should be distributed for estimating the separate effective precisions of azimuth and elevation (which would be more useful). This is a question of measuring the relative degree to which x is determined from azimuth and from elevation; ditto for y. (z competes with x and y for elevation. We can consider it subtracts 1 degree-of-freedom from what they leave.) Because of the smaller separate sample-sizes, and because of the physical imbalance of 3-dimensional degrees-of-freedom between azimuth and elevation 6*, it may be concluded our average-angular precision makes our separate angular measurements look a little better and a lot more alike than they are.

It is suggested that, in evaluating effective azimuth and elevation precisions, one would apply their separate 't' corrections. And, that one would bypass or divide out said corrections for quality-control averages of these separate precisions. Perhaps the method of generating 't' tables will work for a fraction of one degree-of-freedom! (And for fractional interpolation.)

The above suggestion differs from present practice in 4 ways: the problem of determining the separate effective degrees-of-freedom of azimuth and elevation; QC averaging of estimates of station-quality which would be based on degrees-of-freedom; estimating separate effective-precisions of azimuth and elevation for each data-point; the problems of propagating these precisions of azimuth and elevation into effective-precision of the x-component and effective-precision of the y-component, and of this elevation precision into effective-precision of the z-component. It may also be concluded our average-angular precision - as far as it goes - makes our 3 precisions of linear-measurement look a little better and a lot more alike than they are.

5* Azimuths alone cannot determine the vertical coordinate. Although it is not widely realized, elevation residuals have x and y components as well as z. (At high elevation-angles, they are mostly x and/or y!) Of course (in the physical sense that is relevant here), azimuth residuals have only x and y components.

6* Suggested approach to the relative degree x depends on azimuth and on elevation; ditto for y. (And to how they divide up azimuth.) Calculate separate precisions of azimuth and elevation based on N - 3/2 degrees-of-freedom. Propagate each of these sigmas into the x-component of the LS-solution by multiplying it by the rms of the partial-derivatives of the x-component with respect to its set-of-angular-measurements. The ratio of the variance of the x-component propagated from azimuth to the sum of the variances of the x-component from azimuth and from elevation is the fractional dependence of x on azimuth. One minus that is its fractional dependence on elevation. Ditto for y. (The ratio of the variance of the x-component from azimuth to the sum of the variances of the x- and y-components from azimuth is the x-fraction of azimuth. One minus that is its y-fraction.) N minus the fractional dependences of x and y on azimuth is the degrees-of-freedom to use in recalculating the effective-precision of azimuth. (N - 1) minus the fractional dependences of x and y on elevation is the degrees-of-freedom to use in recalculating the precision of elevation. Iterate propagation of these revised sigmas into x- and y-components - as many times as necessary. (One iteration may suffice.) Effective-sigma of the x-component is the rms of its final effective-sigmas from azimuth and from elevation. Ditto for effective-sigma of the y-component. Propagate final effective-sigma of elevation into the z-component, to get the latter's effective-sigma.
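The degrees-of-freedom bookkeeping in the suggested approach can be sketched in a few lines. This is only an illustration: the function names and the four input variances (the variance of x and of y propagated from azimuth and from elevation) are hypothetical, and `n` stands for whatever count N denotes in the text.

```python
def fractional_dependences(var_x_az, var_x_el, var_y_az, var_y_el):
    """Fractional dependence of x and y on azimuth and on elevation.

    Each input is the variance of a Cartesian component propagated from
    one angular measurement type (illustrative names, not the paper's).
    """
    x_on_az = var_x_az / (var_x_az + var_x_el)
    y_on_az = var_y_az / (var_y_az + var_y_el)
    return {
        "x_on_az": x_on_az,
        "x_on_el": 1.0 - x_on_az,
        "y_on_az": y_on_az,
        "y_on_el": 1.0 - y_on_az,
    }


def effective_degrees_of_freedom(n, fracs):
    """Degrees-of-freedom for recalculating azimuth and elevation precision."""
    dof_az = n - (fracs["x_on_az"] + fracs["y_on_az"])
    dof_el = (n - 1) - (fracs["x_on_el"] + fracs["y_on_el"])
    return dof_az, dof_el


if __name__ == "__main__":
    fracs = fractional_dependences(3.0, 1.0, 1.0, 1.0)
    print(fracs)
    print(effective_degrees_of_freedom(4, fracs))
```

Whatever the fractions are, the two results add up to 2N - 3, the same total as the initial equal split of N - 3/2 for each angle, so the recalculation only redistributes degrees-of-freedom between azimuth and elevation.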


One of the operations of geometrical statistics, then, is transformation of precision indices from instrument-coordinates to data-coordinates. Our Davis cine method accomplishes this, in its way, simultaneously with (multivariate) statistical transformation of the precision indices from observations to data. This paper suggests that its procedure, above, would do the same thing much more validly.


The foregoing suggestion would also apply to the totally-optimum linear least-squares proposed under MECHANISM 3, above. The computing would simplify, since propagations would be only (multivariate) statistical transformations of precision indices from observations to data.


A  second  element: 


2. Geometric convergence. This element was treated under: MECHANISMS 2, 3; RATING CONFIGURATIONS OR STATIONS; and COMPONENTS OF COVARIANCE (GDOP MADE EASY). These effectively disentangled GDOP from its matrices, in forms that are more easily understood and used. This was done by graphical representation and elementary trigonometry. Jeffreys (Ref. 13) has shown that (the) trig. functions are laws of (physical) mensuration, not merely mathematical definitions.

It was feasible, in this paper, to deal with the projected planar convergence of the linear intersections of the two angular observations of each of two optical (or radar) stations. This approach is certainly one of
convergence  of  the  mdiriduiii  ohsotvaiinnjhwri'avvs  "t  llu-  total  nuiniier  of  simuilatieously-lracking  stations 
would  be  the  statistical  and  quality  optnmm  (criteria  I  and  t  of  Figure  30).  In  Figure  IN  0^  can  her;A/ 
nr  0|.  |.  The  values  of  these  may  differ,  hut  the  value  of  each  must  he  the  same  for  both  stations. 

A third element:

3. (Theorem) Totally-optimum measurement analysis optimizes the geometric components of position-measurement covariance, as well as the instrument components. Taking slant-range into account - transforming residuals from station-variable to missile-variable - before least-squaring was treated under MECHANISM 3. Taking convergence-angle into account before least-squaring is treated in this paper only to the point of station-selection (RATING CONFIGURATIONS OR STATIONS).

4. (Theorem) The multidimensional-optimum position is generally not optimum for any single dimension. This element was treated under MECHANISM 4.

As long as we deal with vectors in terms of their components, it is valid to characterize precision by three 1-dimensional frequency distributions. But, if one deals with a 3-dimensional vector as an entity (e.g., in vector analysis), the standard deviation of its value (quality of the vector) is: 7*


where $\lambda$, $\mu$, and $\nu$ are the vector's respective direction cosines. Our t (.6827) corrections to individual component-sigmas should be divided out beforehand. 8* This formula applies to any performance variable. For the case where $\sigma_x^2 = \sigma_y^2 = \sigma_z^2$, the direction cosines drop out. For the position vector, $\sigma_v$ is how well we know the distance of the missile from its launcher reference. It's the precision of the radius vector (r) of spherical coordinates. (So, the above is one of the 3 geometrical-statistics formulas required to transform precision indices from Cartesian coordinates to spherical.) For the velocity or acceleration vector, $\sigma_v$ is (correspondingly) the precision of the missile's radial-velocity or radial-acceleration. The formula without the direction-cosines is also the vector representation of 3-dimensional precision (vector of the quality).
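The displayed formula for the vector's standard deviation did not survive in this copy. The sketch below therefore assumes the usual first-order propagation for uncorrelated components, $\sigma_v = \sqrt{\lambda^2\sigma_x^2 + \mu^2\sigma_y^2 + \nu^2\sigma_z^2}$, which agrees with the statement that the direction cosines drop out when the three variances are equal. The function name and arguments are illustrative only.

```python
import math


def vector_sigma(sigma_x, sigma_y, sigma_z, direction):
    """Standard deviation of a vector's magnitude along its own direction.

    Assumes uncorrelated components (an assumption of this sketch).
    `direction` is any nonzero (x, y, z) triple; it is normalized here
    to give the direction cosines lambda, mu, nu.
    """
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0.0:
        raise ValueError("direction must be a nonzero vector")
    lam, mu, nu = (c / norm for c in direction)
    return math.sqrt((lam * sigma_x) ** 2 + (mu * sigma_y) ** 2 + (nu * sigma_z) ** 2)


if __name__ == "__main__":
    # Equal component sigmas: the result does not depend on direction.
    print(vector_sigma(2.0, 2.0, 2.0, (1.0, 2.0, 3.0)))
    # Vector along x: only the x sigma contributes.
    print(vector_sigma(3.0, 4.0, 5.0, (1.0, 0.0, 0.0)))
```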

Though not commonly considered a geometric dimension, overcoming effective time-measurement differences among station-observations - toward a happy-medium difference - is part of our open-ended MECHANISM 6. This is not limited to differences in the time-signal, but includes all differences in synchronization throughout the data-process (Ref. 6). It shows up as an unbalanced contribution to position-measurement variance (maximum in the direction of the velocity-vector - zero, normal to that).

5. (Finding) In flight-measurement, curves of average precision vs number-of-observing-stations may be fitted by generalizing, to a variable degree, the exponential dependence of the standard-error-of-the-mean-of-a-normal-distribution on sample-size. The exponential improvement of position-precision by increasing stations can be as much as 4 times that of the classical relationship of numerical statistics. (2
projects  in  which  the  improvement  was  A  times  are  disregarded,  because  duta  were  somewhat  limited.) 
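One way to estimate such a generalized exponent from a curve of average precision vs number of stations is an ordinary least-squares line in log-log coordinates. The sketch below assumes the model sigma(N) = k * N**(-m), with m = 1/2 as the classical standard-error case; the function name and the sample numbers are illustrative and are not taken from the paper's data.

```python
import math


def fit_precision_exponent(stations, sigmas):
    """Fit sigma = k * N**(-m) by least squares on log(sigma) vs log(N).

    Returns (m, k). The power-law form is an assumption of this sketch.
    """
    xs = [math.log(n) for n in stations]
    ys = [math.log(s) for s in sigmas]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    m = -sxy / sxx                      # slope of the log-log line is -m
    k = math.exp(y_bar + m * x_bar)     # intercept gives log k
    return m, k


if __name__ == "__main__":
    stations = [2, 3, 4, 5]
    sigmas = [10.0 * n ** -2.0 for n in stations]   # synthetic curve with m = 2
    m, k = fit_precision_exponent(stations, sigmas)
    print(m, k, m / 0.5)   # last value: improvement relative to the classical exponent
```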

In this sense, at least, geometrical statistics is a generalization of numerical statistics. In addition to numerical convergence, the above exponential dependence of precision response is clearly a function of geometric convergence (the configuration subcomponent of the geometric component of position-measuring covariance); is apparently a function of slant-range (the scale subcomponent of the geometric component); and is probably a function of multidimensional sampling. It may also depend to some extent on precision-indices being variant under coordinate-transformation - and on time-measurement.

7* This equation was derived for the writer by Mr. W. T. Mimmack. It checks a classical source (Ref. 15).

8* To normalize, the 3-dimensional sigma should be multiplied by a 3-dimensional 't'. These can be deduced, for either the general or equivalent case, from Reference 16.

A negative view might be that classical numerical statistics is not applicable to flight-measurement, because observations and component-values of flight-data lack normality, independence, and homogeneity.

This paper suggests that the 'random-error' concept of numerical statistics is not adequate by itself for flight-measurement. That, besides more concern with the physical and geometrical meanings of our math, more formal development of the boundary-discipline of geometrical statistics might be helpful.

The findings of this paper logically lead to the following statistical 'heresies':

(1) We should average our precisions in whatever exponential form is proportional to sample-size for that vehicle. This calls for an averaging-spectrum ranging from averaging variances to averaging square-roots of standard-deviations.

(2) Our results call for an optimizing spectrum ranging from least-squares to least-square-roots.
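The averaging-spectrum of heresy (1) can be read as a power mean. If precision responds as N**(-m), then sigma**(1/m) is the form proportional to 1/N, so one averages sigma**(1/m) and converts back. This reading, the function name, and the sample values are assumptions of the sketch: m = 1/2 gives the familiar rms (averaging variances) and m = 2 gives averaging of square-roots of standard-deviations.

```python
import math


def spectrum_average(sigmas, m):
    """Power-mean average of precisions for a response exponent m.

    Averages sigma**(1/m) and raises the mean back to the power m.
    m = 0.5 reproduces the rms average; m = 2 averages square roots.
    """
    p = 1.0 / m
    mean_p = sum(s ** p for s in sigmas) / len(sigmas)
    return mean_p ** m


if __name__ == "__main__":
    sigmas = [3.0, 4.0]
    for m in (0.5, 1.0, 2.0):
        print(m, spectrum_average(sigmas, m))
```

The rms end of the spectrum weights a bad round most heavily, which matches the remark in the text that variance is sensitive to bad data.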

We aren't doing these heretical things because: it's inconvenient; we want to maintain our bridge to standard methodology; it's desirable for a data-quality index to be sensitive to bad data (as variance is).

Each of the foregoing elements of geometrical statistics is also relevant to station and system calibration and residual bias, and their statistical uncertainties. (See EFFECT ON PHYSICAL ACCURACY, below.)

EFFECT ON PRECISIONS OF VELOCITY AND ACCELERATION. Figure 28 shows average precision of velocity of Chaparral missile (Feb - Nov '67) vs average number of Contraves stations computed. Its hyperbolic models show it closest to improvement of precision in direct proportion to number-of-stations.

Over 6 sets of Contraves and one set of DOVAP data, there was a slight tendency for velocity precision to depart from proportionality to N in the direction of $N^{1/2}$. Also, a slight tendency for the curves to be straighter than the hyperbolic model.

Figure 29 shows average precision of acceleration of Redeye target (Mar - May '62) vs average number of Contraves stations computed. The upper half of the curve was closest to improvement in proportion to N; the lower half closest to $N^{1/2}$. Overall, it was a tossup between those two.

Over 6 sets of Contraves and one set of DOVAP data, acceleration precision was evenly divided between proportionality to N and proportionality to $N^{1/2}$. More than half the curves tended to be straighter than their closest hyperbolic model. The tendency for average quality to be influenced by a bad round increased in going from position to velocity to acceleration.

It is planned to look at the above relationship for smoothed position, which occurs in our process between measured position and velocity.

Propagating the effect of number-of-stations on precision-of-measured-position through: time series, polynomial fit, lack-of-fit, 1st derivative, and 2nd derivative is not attempted in this paper. It is reasonable that the precision of a position time-series reflects to a considerable degree the precision of its values. (And, so on.)

Some of the variability of our observations is 'short-period', so their numerical convergence (MECHANISM 1, above) should be reflected to some extent in time-varying precisions. The difference between 3-dimensional and 1-dimensional sampling (MECHANISM 4) should also have a 'short-period' component. Mechanisms 2 and 3 are time-varying, but not clearly short-period.

The 't' correction to position-precision (MECHANISM 5) does not enter time-varying precisions. These 'start over' with the time-series 'observations' of position. ('t' corrections have not been applied to our time-varying precisions.) Calculating effective position-component precisions from effective-precisions of azimuth and elevation (under element 1 of GEOMETRICAL STATISTICS) would not enter time-varying precisions.

Effective time-and-sync. differences among station-observations (under element 4 of GEOMETRICAL STATISTICS) should have a short-period component. Overcoming this through increase in sample-size - toward a happy-medium difference - would be doubly reflected in actual precisions of velocity and acceleration. (Indirectly through space-measurement and directly through time-measurement.) We are not yet propagating time-precision through the derivatives.

Our smoothing interval is usually constant for a project. Number-of-stations should influence the relationships between smoothing-interval and time-varying precisions.

EFFECT ON PHYSICAL ACCURACY. The effect of number of observing stations on flight measurement accuracy (bias) is a subject for further investigation.

I am willing to postulate a probabilistic improvement of physical accuracy by increasing stations in flight-measurement - because we thereby increase the probability of mutual compensation of station biases in both magnitude and direction.

Churchman  (Ref.  17)  notes  that  ’the  true  value  is  not  a  random  variate,  that  it  is  a  unique  element 
among  the  real  numbers,  and  that  the  probability  of  its  lying  iim/M*  interval  is  therefore  either  exactly  one 
or exactly zero.' However, such absolute knowledge is not granted us. Our estimates of physically-true values - or of bias therefrom - are random variates. (In the sense that physical info is unavoidably probabilistic.)

System precision (data precision) is a collective measure of the mutual calibration of its stations in space and time. System bias (data accuracy) is the net (uncompensated) sum of its station biases (in space and time). On the average, improving both individual and mutual station calibrations (station accuracy and system precision) should improve the net calibration (system accuracy) more than would individual calibrations alone.

MECHANISM 2 treated the error in Figure 18 as a dispersion (or precision) index. Let's consider it a discrete bias. Then, in Figure 30b, if the discrete angular errors happen to have opposite signs, only the baseline-normal diagonal of the smaller almost-diamond in Fig. 18 exists. From MECHANISM 2:

$$\frac{r \sin \Delta\alpha}{\sin(\theta/2)}$$

And, the first two columns of the first table under MECHANISM 2 also compare the accuracy of perpendicular convergences for either horizontal or vertical planes. In Fig. 30a, if the discrete angular errors happen to have the same sign, only the baseline-parallel diagonal of the smaller almost-diamond in Fig. 18 exists. From MECHANISM 2:

$$\frac{r \sin \Delta\alpha}{\cos(\theta/2)}$$

This equation is meaningful only for the horizontal plane.

The following table averages the accuracies of the vertical convergences for the (even) chances that station biases will have the same or opposite signs 9*:

Vertical Average Accuracy Factors

VERTICAL CONVERGENCE (θ)    180°    135°    90°     45°     15°     5°      1°
AVERAGE ACCURACY FACTOR     0.50    0.54    0.71    1.31    3.84    11.5    57.2


The above values are half those of the first table under MECHANISM 2. (When signs are opposite, there is no net bias.) So, vertical convergences rank the same for average-accuracy as for average-precision, but the effect of a given departure from the 180° optimum is only half as great.
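The tabulated vertical factors are reproduced, to slide-rule accuracy, by 0.5 / sin(θ/2). That expression is inferred here from the table values and from the statement that they are half those of the first table under MECHANISM 2; it is not quoted from the paper, and the pairing of the last value with 1° is likewise an inference.

```python
import math


def vertical_average_accuracy_factor(theta_deg):
    """Half of 1/sin(theta/2): the even-chance average for vertical convergence."""
    half_angle = math.radians(theta_deg) / 2.0
    return 0.5 / math.sin(half_angle)


if __name__ == "__main__":
    for theta in (180, 135, 90, 45, 15, 5, 1):
        print(theta, round(vertical_average_accuracy_factor(theta), 2))
```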

9* Even chance, because - to the extent bias of a given-type instrument consistently has the same sign - it is more likely to be adjusted or corrected for.

The following table averages the accuracies of the horizontal convergences for the (even) chances that station biases will have the same or opposite signs. (In each case, a zero enters for the other diagonal.)


Horizontal Average Accuracy Factors

PERPENDICULAR CONVERGENCE (θ)    179°    175°    165°    135°    90°     45°     15°     5°      1°
AVERAGE ACCURACY FACTOR          28.9    6.0     2.17    0.92    0.71    0.92    2.17    6.0     28.9


Horizontal convergences rank the same for average-accuracy as for average-precision (third table of MECHANISM 2), but the effect of a given departure from the 90° optimum is less than half as great.

Still, there is plenty of potential for improving position accuracy by improving the average goodness of intersection-angles through increase in sample-size. Mechanisms 1 and 4 (Fig. 27), and our time-measurement possibility, also apply to accuracy as well as precision.
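The horizontal factors are matched by averaging the two diagonal terms, 1/sin(θ/2) and 1/cos(θ/2), each paired with a zero for the other diagonal, over the two equally likely sign cases. That gives (1/sin(θ/2) + 1/cos(θ/2)) / 4. The expression is inferred from the tabulated values and is offered only as a check, not as the author's stated formula.

```python
import math


def horizontal_average_accuracy_factor(theta_deg):
    """Even-chance average of the baseline-normal and baseline-parallel terms."""
    half_angle = math.radians(theta_deg) / 2.0
    normal_term = 1.0 / math.sin(half_angle)     # applies when signs are opposite
    parallel_term = 1.0 / math.cos(half_angle)   # applies when signs are the same
    # Each case averages its term with a zero, then the two cases are averaged.
    return (normal_term / 2.0 + parallel_term / 2.0) / 2.0


if __name__ == "__main__":
    for theta in (179, 175, 165, 135, 90, 45, 15, 5, 1):
        print(theta, round(horizontal_average_accuracy_factor(theta), 2))
```

The symmetry of the expression about 90° accounts for the mirrored values in the table.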

SUMMARY. Supplementing the ABSTRACT: WSMR-standard precision of a measured component-value reflects the agreement among its station-observations at that point in time. We use rms-average precisions for operating and management control. It was found curves of average precision vs number-of-observing-stations may be fitted by generalizing the exponential dependence of the standard-error-of-the-mean of a normal distribution on sample-size. The precision-response of cinetheodolite position-data ranged from proportionality to $N^{1/2}$ to proportionality to $N^{3}$. Five mechanisms seemingly
involved in this profound effect, plus an open-ended catchall, are summarized in Fig. 27. (My present guess as to their descending magnitude: 2; 1; 5; tie between 3 and 4.) This investigation has led to methods-improvement suggestions for collection, reduction, and quality-estimation. A marriage of geometry and statistics has been partly consummated, on simple terms. Previous work on optimum convergence (Ref. 6) was extended to quantitative evaluation of the precision and accuracy of all linear convergence-angles - for measuring the vertical and horizontal components of space-position. It appears that if we incorporated slant-range ahead of our least-squares estimate of position, we would produce more precise and accurate data. (Implicit inequality of variances in angular least-squares apparently does more harm than if this inequality were minimized by linear least-squares.) A sufficient reason for using least-squares: Even when all the rigorous assumptions of the Least-Squares Principle are violated, least-squares still yields the minimum vector-resultant of the observed errors. A method was given for evaluating our assumptions that propagating variance into a least-squares position component yields the standard-error of a mean, and that a 3-dimensional optimum is optimum for each component. It was suggested the 't' correction to individual position-precisions should be removed before averaging these. An approach to evaluating the effective sample-size of our average-precisions has been described. The procedure suggested for rating measuring-ability of test-configurations can be set up with trig.-table and slide-rule, and operated with an adding-machine. Components-of-position-measuring-covariance and GDOP (geometrical-dilution-of-precision) were presented in forms easily understood and used. Some elements of the boundary-discipline of geometrical statistics have been discussed. A way was suggested of taking the physical imbalance of degrees-of-freedom between azimuth and elevation into account - to calculate more valid angular and linear-component precisions. Relationships between component-quality and vector-quality were touched on. The entire paper is relevant to geodesy as well as flight-measurement.

Angular LS

For: OM = ON = 50,000 ft
(5.6 ft)² + (18.7 ft)² = 2(13.8 ft)²
Σ = 24.3 ft; rms = 13.8 ft

LS: 18.7 - 8.6 = 10.1 ft

Linear LS

(8.6 ft)² + (8.6 ft)² = 2(8.6 ft)²
Σ = 17.2 ft; rms = 8.6 ft

-38% in rms
-61% in ms

FIGURE 26 (axes: AVERAGE PRECISION, Variance - ft; NUMBER OF STATIONS IN SOLUTION)


FIGURE  27 


MECHANISMS  OF  IMPROVEMENT  OF 
FLIGHT  MEASUREMENT  PRECISION 
BY  INCREASING  STATIONS 


1. BY OVERCOMING SMALL SAMPLE-SIZE.

2. BY OVERCOMING NONOPTIMUM TEST CONFIGURATION.

3. BY OVERCOMING NONOPTIMUM CHOICE OF VARIABLE TO BE OPTIMIZED.

4. BY OVERCOMING OUR INABILITY TO OPTIMIZE COORDINATE.

5. BY OVERCOMING NONOPTIMUM ESTIMATE OF DEGREES-OF-FREEDOM.

6. BY OVERCOMING ERRORS INCURRED IN APPLYING STATISTICS TO FLIGHT-MEASUREMENT.

(Figure axis: AVERAGE VELOCITY PRECISION - ft/sec)

COMMENTS ON PRESENTATION BY FRED HANSON


William  Kruskal 
Department  of  Statistics 
The  University  of  Chicago 
Chicago,  Illinois 


You will recall that, when we met at the Edgewood Arsenal conference, I expressed regret that I could not attend the session at which you presented your problem, and I added that I would look at the materials you had sent to me.

Not everything in those materials is clear to me, but I take it that your major worry is that the empirically determined standard deviations of position determinations, as a function of number of stations, N, decrease faster when N grows than the $N^{-1/2}$ rate that would be expected under standard circumstances. You apparently have evidence that the rate is more like $N^{-3/2}$.

The first thought that comes to mind is that the standard $N^{-1/2}$ rate depends squarely on the assumption that the observations are uncorrelated and have equal variances. In particular, if the observations have equal variances but negative correlations, then the standard deviation of the sample mean is less than that expected under the standard assumptions.

Let me make this specific. Suppose, for simplicity, that we are dealing with N random variables, all with variance $\sigma^2$ and such that any pair of variables has correlation $\rho$. It is a standard fact that $\rho$ cannot be less than $-1/(N-1)$.

Under these circumstances, the standard deviation of $\bar{X}$ (the average of the $X_i$) is

$$\frac{\sigma}{\sqrt{N}}\sqrt{1 + (N-1)\rho}$$

Suppose that

$$\rho = \frac{c - N^{2}}{N^{2}(N-1)}$$

for some positive constant c. Then, substituting back, we would have for the standard deviation of $\bar{X}$,

$$\frac{\sigma\sqrt{c}}{N^{3/2}}$$


It seems to me conceivable that something like the above may be taking place for your radar measurements. Suppose that a measurement error comes from small changes in the angular orientations of the object measured. Then the effect of such a small change on one radar station might be nearly linearly related to the effect on another station, and with a negative slope.
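The equal-correlation argument can be checked numerically. The sketch below evaluates the standard deviation of the mean of N equally correlated variables in closed form and by summing the covariance matrix directly, and then shows how a correlation of the form (c - N**a) / (N**a (N - 1)) turns the usual rate into sigma * sqrt(c) / N**((a + 1) / 2). The exponent `a` is left as a parameter because the exponents are hard to read in this copy; the function names are illustrative.

```python
import math


def sd_of_mean(sigma, n, rho):
    """Closed form: (sigma / sqrt(n)) * sqrt(1 + (n - 1) * rho)."""
    factor = 1.0 + (n - 1) * rho
    if factor < -1e-12:
        raise ValueError("rho is below the lower bound -1/(n-1)")
    return sigma / math.sqrt(n) * math.sqrt(max(0.0, factor))


def sd_of_mean_by_summation(sigma, n, rho):
    """Same quantity from Var(mean) = (1/n^2) * sum of all covariances."""
    total = sum(
        sigma ** 2 * (1.0 if i == j else rho)
        for i in range(n)
        for j in range(n)
    )
    return math.sqrt(max(0.0, total)) / n


def rho_for_power_rate(c, n, a):
    """Correlation that makes 1 + (n - 1) * rho equal to c / n**a."""
    return (c - n ** a) / (n ** a * (n - 1))


if __name__ == "__main__":
    sigma, n, c, a = 2.0, 4, 1.0, 2
    rho = rho_for_power_rate(c, n, a)
    print(rho, sd_of_mean(sigma, n, rho), sd_of_mean_by_summation(sigma, n, rho))
    print(sigma * math.sqrt(c) / n ** ((a + 1) / 2))
```

For any positive c the required correlation stays above the lower bound -1/(N - 1), but it does have to change with N, which is the point the letter goes on to question.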

Of course this is all speculation because I do not understand the measurement set-up and the data reduction method. In particular, it would be strange for $\rho$ to depend strongly on N.

With cinetheodolites, it is hard for me to see offhand how large negative correlations could be effective.

More basically, it is not clear to me how your empirical standard deviations were obtained. Is it possible that your results are a result of something about that method?

EVALUATION OF NICKEL-IRON AND NICKEL-ZINC BATTERIES


Martin  J.  Sulkes 

Power Sources Division, Electronics Components Laboratory
USAECOM, Fort Monmouth, N. J.


Most Army communication missions requiring secondary batteries are presently being met by the nickel-cadmium (Ni-Cd) system, with silver-zinc (Ag-Zn) filling the remainder of the missions that require low weight. Both of these systems contain expensive, limited-supply materials: namely, silver at $32/lb. or cadmium at $3.25/lb.

The nickel-iron (Ni-Fe) and nickel-zinc (Ni-Zn) systems are potential low cost replacements for the Ni-Cd and Ag-Zn systems, since Zn and Fe are less than $0.15/lb. Ni-Fe and Ni-Zn batteries have been known for many years; however, until the present they have not developed the energy densities and life of which they are theoretically capable.

For example, iron has a theoretical capacity of 0.98 Ah/gm compared to 0.47 Ah/gm for Cd and approximately 5% higher voltage. However, the Ni-Fe (Edison) cell has low energy density (8 Wh/lb) compared to 12-15 Wh/lb for Ni-Cd. In addition, the Edison cell's low temperature and high rate performance are poor. Its cycle life, however, is excellent. Since much of the Edison cell's poor performance is due to the iron electrode, an improved iron such as was developed by GT & E labs could make this an attractive system.

The nickel-zinc system has had limited cycle life because of
shorting by zinc dendrites and loss of capacity due to zinc electrode
shape change. Energy density has been limited by the need to include
a large excess of zinc. Recent work on Ag-Zn batteries and fundamental
investigations of the zinc electrode have indicated how dendrite formation
could be controlled and zinc shape change reduced. It was, therefore,
estimated that through the use of an improved zinc electrode and the
contractor's high energy nickel electrode a battery capable of delivering
up to 30 Wh/lb for 200 or more cycles could be developed. However, a
great deal of investigation of the various interrelated cell construction
factors was required to successfully achieve the desired goals.

The  objective  of  this  work,  therefore,  was  to  optimize  a  design  for 
nickel-zinc  and  nickel-iron  and  evaluate  such  cells  in  standard  line 
configuration  as  possible  low  cost  replacements  for  existing  systems. 

A  comparison  of  the  discharge  curves  for  an  equal  weight  of  all  4 
electrochemical  systems  is  shown  in  Figure  1.  Specifically,  this 
evaluation  explored  the  construction  of  Ni-Fe  and  Ni-Zn  cells  for 
various  design  parameters,  and  tested  them  over  a  variety  of  rates  and 
temperatures.

The nickel-iron system was investigated in two 2^4 design
experiments, while the nickel-zinc system was studied in a replicated 2^3
experiment. Each experiment had 16 cells. In all cases, the assembly
of cells, electrolyte fill and position on various tests were carried
out in random order as determined from a table of random numbers. All
experiments were analyzed by the technique of multiple linear regression.
The calculations were made on a Scientific Data Systems 930 computer
using the Multiple Linear Regression program from the IBM 360 SSP
(Scientific Subroutine Package). Surprisingly, when the calculations
from the first experiment were checked by the Yates method, an error
was discovered in the IBM program. This was then corrected, and the
results obtained through manual and computer calculation then
agreed.

The  multiple  linear  regression  technique  used  for  data  analysis 
assumes  that  the  total  response;  i.e.,  the  faradaic  capacity,  is  a 
linear  function  of  the  independent  variables  (factors)  being  studied. 

The general equation is

$$Y = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_n X_n = b_0 + \sum_{i=1}^{n} b_i X_i$$

where $Y$ is the dependent variable (response) and $X_1, X_2, \ldots, X_n$ are
the factors in the experimental study. The coefficients $b_1, b_2, \ldots, b_n$
(partial regression coefficients) were determined by fitting the response
data to the general equation. Each coefficient then became an effect
value and an indicator of the effect of its factor on the total response,
independent of the other factors. The sign of the coefficient (±) determined
the direction of the effect in going from one level to another of
the factor. The constant $b_0$ is the intercept on the Y axis.
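The relation between the regression coefficients and the Yates method mentioned above can be illustrated with a minimal sketch. It assumes the factors are coded -1/+1 and the runs are listed in standard (Yates) order; the response values are synthetic and purely illustrative, and all function names are hypothetical rather than taken from the SSP program used in the study.

```python
import random
from itertools import product

def full_factorial(k):
    """2^k design in standard (Yates) order, coded -1/+1, first factor varying fastest."""
    return [list(row[::-1]) for row in product((-1.0, 1.0), repeat=k)]

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def regression_coefficients(X, y):
    """Least-squares fit of Y = b0 + sum(b_i X_i) via the normal equations."""
    design = [[1.0] + row for row in X]          # prepend the intercept column
    cols = list(zip(*design))
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)

def yates_main_effects(y, k):
    """Yates's algorithm; y must be in standard order. Returns effects of A, B, ..."""
    col = list(y)
    for _ in range(k):
        sums = [col[i] + col[i + 1] for i in range(0, len(col), 2)]
        diffs = [col[i + 1] - col[i] for i in range(0, len(col), 2)]
        col = sums + diffs
    half = len(col) / 2                          # contrast divisor 2^(k-1)
    return [col[2 ** i] / half for i in range(k)]

# Synthetic capacities (Ah) for a 2^4 experiment; NOT data from the report.
rng = random.Random(0)
assumed_b = [0.2, -0.1, 0.8, 0.3]
X = full_factorial(4)
y = [5.0 + sum(b * x for b, x in zip(assumed_b, row)) + rng.gauss(0, 0.05) for row in X]

b = regression_coefficients(X, y)
effects = yates_main_effects(y, 4)
# With -1/+1 coding, each Yates effect is twice the matching regression coefficient.
agree = all(abs(2 * bi - e) < 1e-9 for bi, e in zip(b[1:], effects))
print(agree)
```

Because the coded design is orthogonal, the two calculations must agree, which is why a hand calculation by the Yates method is a useful independent check on a regression program.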

The first nickel-iron experiment consisted of 16 cells made with
four variable construction factors each at two levels as shown in
Figure 2. These cells were given a total of 16 charge-discharge cycles.
Based  on  the  pre-tested  capacity  of  the  positive  electrodes,  it  was 
expected  that  these  cells  would  have  a  nominal  capacity  of  6Ah  in  the 
normal,  positive  limiting  design.  However,  when  these  cells  were 
cycled,  lower  than  normal  capacities  were  obtained  after  several  cycles. 
This  low  capacity  was  traced  to  difficulties  in  control  of  the  chemical 
activation  process  for  the  iron  electrodes  used  in  these  cells.  To 
eliminate  this  problem,  the  next  experiments  were  assembled  with  iron 
electrodes  made  by  a  controlled  electrochemical  activation  process. 

Despite this setback with the Experiment 1 cells, valuable
experience was gained by the contractor on cell assembly techniques.
Furthermore, the data from cycle one (Figure 3) upon statistical
analysis did demonstrate the dependence of cell capacity upon the
variable factors studied in the experiment. This analysis, shown in
Figure 4, was run after non-significant interactions were eliminated
from the analysis. The F value of 38.188 indicates that the data fit
the assumed linear relationship. With 9 and 6 degrees of freedom an
F value exceeding 18.69 is significant at the 0.999 probability level.
Since the computed t values are not mutually independent, they could
only be used for ranking the order of importance of the variable factors
and for showing the direction of effect of the variable levels. Thus,
it can be said that variable C (electrolyte concentration) with a computed
t value of 11.544 had the major effect on the cell capacity. The + sign
shows that the high level (31% KOH) of this variable gives more capacity
than the low level (21% KOH); in fact, the 31% KOH yielded 37% more
capacity than the 21% KOH. Figure 5 ranks the variables in order of
importance and shows the preferred variable level. Similar analyses
of data from later cycles of the cells in Experiment 1 gave similar
results, thus strengthening these conclusions. However, it should be
pointed out that each succeeding cycle is not independent of the first
cycle data, since the cell construction is fixed.
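The F value quoted above can be checked against the two rows of Figure 4 that are reproduced in these proceedings. The sketch below assumes the usual analysis-of-variance bookkeeping, in which the regression sum of squares and degrees of freedom are the total minus the deviation from regression; the function name is hypothetical.

```python
def regression_f(total_ss, total_df, resid_ss, resid_df):
    """F ratio for the regression, derived from the total and residual rows."""
    reg_ss = total_ss - resid_ss          # sum of squares due to regression
    reg_df = total_df - resid_df          # degrees of freedom due to regression
    f_value = (reg_ss / reg_df) / (resid_ss / resid_df)
    return f_value, reg_df

# Figure 4: deviation from regression 9 d.f., SS 0.747; total 15 d.f., SS 19.776
f_value, reg_df = regression_f(19.776, 15, 0.747, 9)
print(reg_df, round(f_value, 1))
```

The result is close to 38.2, within rounding of the reported 38.188, with 6 degrees of freedom attributed to the regression and 9 to the deviation from regression.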

Preliminary studies of the dependence of cell capacity on charge
rate and Ah input indicated that higher charge rates (C/2 to C/4) were
more beneficial and more efficient than low charge rates (< C/8). Since
the cells were positive limiting, this effect is a function of the
positive plate, verifying previous experience with positive-limiting
nickel-iron and nickel-cadmium batteries. Further studies are necessary
to determine optimum charging conditions.

A second, 16 cell nickel-iron experiment was set up in accordance
with the design shown in Figure 6. These cells all contained
electrochemically activated iron electrodes as opposed to the chemically
activated ones in the first Ni-Fe experiment. Because of the change
in the iron electrodes, it was thought necessary to repeat the two most
significant factors found in Experiment 1.

A  total  of  8  charge-discharge  cycles  were  run  in  accordance  with 
the  regime  given  in  Figure  7  and  analyzed.  In  the  analysis  the  variable 
factors  and  their  first-order  interactions  were  the  independent  variables, 
while the Ah capacity was the dependent variable. In addition, percent
capacity retention in Ah was analyzed by comparing the Ah capacities on
cycle 4 with those obtained on cycle 5 after a 7-day charged stand.

Figure  8  gives  the  actual  effect  values  of  the  various  factors  on  the 
dependent  variable  during  the  eight  cycles  run. 

It is apparent that LiOH content (D), KOH concentration (C) and
the interaction of these two variables have the greatest effect on Ah
capacity, at C/4 rates, with the saturated LiOH better than no LiOH and
31% KOH better than 21% KOH. It is also interesting to note that the
charge retention cycle (#5) and the high rate cycle (#7) disrupted
the relative ranking of effects on subsequent cycles. Also, with respect
to percentage Ah charge retention (#5), the KOH concentration was the
variable with the major contributing effect.

Therefore, on the basis of the two nickel-iron experiments the
following design features were chosen for the optimum Ni-Fe cell
design.

1.  Electrolyte Concentration    31% KOH

2.  Additive                     LiOH saturated electrolyte

3.  Electrode Geometry           End plates are positives

4.  Separator                    Nylon-Cellophane-Nylon

5.  Electrode Thickness          0.037"

These factors by no means completely define the system and, therefore,
additional experiments will have to be run to determine the
influence of such factors as positive to negative capacity ratios
and quantity of overcharge per cycle, which were at fixed levels in
the experiments.

Nickel-zinc  Cell  Experiments 

One experiment was run to date on Ni-Zn cells. The design of
this experiment, shown in Figure 9, contains only 3 variable factors,
each at two levels. Replication was provided since it was known that
zinc systems are more erratic, particularly with regard to cycle life,
than the long-lived Ni-Fe or Ni-Cd systems. Therefore, additional cells
are required to achieve more reliable data analysis and also to provide
for substitute cells in the case of premature failure.

Seven cycles were given to the cells in Ni-Zn experiment 1. The
first cycle analysis is given in Figure 10. An F value exceeding 2.75
is significant at the 0.90 level, and a value exceeding 3.73 is significant
at the 0.95 level. Thus, the fit of the Ni-Zn first cycle data to the
regression curve is only fair. However, if non-significant interactions
are eliminated the fit is greatly improved.

The computed t values indicate that interaction between the zinc
electrode substrate thickness and electrode geometry (AB) is the major
contributor to the ampere-hour variation observed. Electrode geometry
(B) is the second most important contributor with the 15 negative, 14
positive cells (B+) producing more ampere-hour capacity than the 14
positive cells (B-). This result is to be expected, since the outer
two positive electrodes in the 15 negative cells are probably more
completely utilized and would show as increased capacity in these
positive limiting cells. The zinc electrode substrate thickness (A)
is the third most important variable, and the negative t value shows
that the low level (0.0025 inch thick) is better than the high level
(0.005 inch thick) of this variable. The excess ZnO variable
apparently had little effect on initial Ah capacity. This was
expected as its effect, if any, was more likely to show on cycle life.
The large interaction effect shown in Figure 11 required some
study to explain, since the factor levels should not have been sufficient
to produce changes of the magnitude found. However, since the
main effect of this interaction was to reduce the performance below
acceptable level, the cause had to be determined to avoid repeating
it in future designs. Cell teardown analysis determined that all
cells had been constructed so as to be tight. However, with the
thick substrate (A+) and the lower number of negative electrodes
(B-) there was an insufficient amount of compressible zinc to prevent
excessive tightness in the cell, which was responsible for the
significant reduction in cell capacity.

On the second and succeeding test cycles on the Ni-Zn experiment 1
cells, an intermittent internal shorting problem became apparent. No
further data is presented, since this shorting problem made statistical
data analysis unreliable. The shorts were particularly evident after
a seven-day charge retention test. Only 6 of the 16 cells showed
appreciable charge retention (from 33 to 77 percent). Examination of
the internal structure of the shorted cells indicated that the shorting
was caused by zinc growth at the top edge of the electrode bridging over
to the adjacent positive electrodes. In future experiments, this will
be corrected in three ways: (1) coating the edges of zinc electrodes
with an inert film forming agent; (2) additional separator height
above electrodes; and (3) less initial electrolyte fill.

Though  this  initial  experiment  did  very  little  toward  achieving 
optimization,  it  did  succeed  in  pointing  out  several  critical  design 
parameters  that  must  be  considered  before  satisfactory  performance 
can  be  obtained  in  a  high  energy  density  Ni-Zn  cell.  Two  additional 
design  experiments  are  planned  to  evaluate  such  construction  factors 
as the negative to positive capacity ratios, the total number of plates
(plate  thickness),  separator  type  and  number  of  layers,  amount  of 
amalgamation  of  the  negative,  etc.  These  factors  must  all  be  explored 
before  a  Ni-Zn  battery  meeting  the  required  goals  can  be  fielded. 

In  Summary:  The  use  of  factorial  design  experiments  has  greatly 
reduced  the  number  of  cells  required  for  the  evaluation  of  these  two 
electrochemical  systems.  This  reduction  in  the  number  of  cells  is 
particularly important for secondary batteries, since by their nature,
each  cell  can  tie  up  testing  space  for  many  months  as  it  repeatedly 
cycles.  Several  important  design  factors  have  been  optimized  for  both 
systems  though  much  more  work  remains. 

This work was carried out by General Telephone and Electronics
Laboratories, Inc., under Contract DAAB07-68-C-0102. Complete data
for the experiments reported on may be found in R & D Technical Report
ECOM-0102-1 by Mr. T. Blickwedel of GT&E Labs, issued in September 1968.
The suggestions and assistance of Mr. Joseph Weinstein of the Electronic
Components Laboratory, USAECOM, are gratefully acknowledged.
Ni-Fe Experiment 1 -- Factorial Design


Figure 4

Multiple Linear Regression Analysis

Source of Variation          Degrees of Freedom   Sum of Squares   Mean Square
Deviation from Regression             9                0.747           0.083
TOTAL                                15               19.776


Figure  6 

Ni-Fe  Experiment  2  —  Factorial  Design 
Ni-Fe Experiment 2 -- Magnitude of effect of the Independent Variables on the Dependent Variable


Ni-Zn Experiment 1 -- Factorial Design




Figure  10 

MULTIPLE REGRESSION -- Ni-Zn Experiment 1, Cycle 1
DESIGN OF EXPERIMENTS AND A STATISTICAL PERFORMANCE MODEL
FOR  A  RADAR  ALTIMETER 

Erwin  Biser 

Avionics  Laboratory,  U.  S.  Army  Electronics  Command 
Fort  Monmouth,  New  Jersey 


GLOSSARY OF TERMS AND SYMBOLS.

$N$              Number of observations

$N(+)$           Number of positive deviations

$N(-)$           Number of negative deviations

$H_m$            Height observation measured by Honeywell Altimeter,
                 Model 7091-A (Modified); Test Item 1

$H_o$            Height observation measured by AN/APN-22 Altimeter;
                 Test Item 2

$H_R$            Reference height observation measured by RCA Laser
                 Range-Finder AN/GVS-1 (XE-6)

$d_{mR}$         $= H_m - H_R$

$\bar{d}_{mR}$   Average deviation of the RCA Laser Range-Finder AN/GVS-1
                 (XE-6) height observations from the Honeywell Altimeter,
                 Model 7091-A (Modified) height observations

$\bar{d}_{oR}$   Average deviation of the RCA Laser Range-Finder AN/GVS-1
                 (XE-6) height observations from the AN/APN-22 Altimeter
                 height observations

$\theta_p$       Angle of pitch measured by the attitude indicator from the
                 vertical (90°) as established by the Honeywell Vertical Gyro;
                 positive angles of pitch indicate the nose of the aircraft is
                 up; negative angles of pitch indicate the nose of the aircraft
                 is down.

$\theta_r$       Angle of roll measured by the attitude indicator from the
                 vertical (90°) as established by the Honeywell Vertical Gyro;
                 positive angles of roll indicate the aircraft rolling to the
                 right; negative angles of roll indicate the aircraft rolling
                 to the left.

$\bar{\theta}_p$ Average deviation of the pitch angle observations from the
                 vertical (90°) as established by the Honeywell Vertical Gyro

$\bar{\theta}_r$ Average deviation of the roll angle observations from the
                 vertical (90°) as established by the Honeywell Vertical Gyro

$s_{\theta_p}$   Sample standard error of the deviations of the angle of pitch
                 ($\theta_p$) observations from the vertical (90°) as established
                 by the Honeywell Vertical Gyro

$s_{\theta_r}$   Sample standard error of the deviations of the angle of roll
                 ($\theta_r$) observations from the vertical (90°) as established
                 by the Honeywell Vertical Gyro

$s(d_{mR})$      Sample standard error of the deviations of the RCA Laser
                 Range-Finder measured observations from the 7091-A
                 Altimeter (Modified) measured observations

$s(d_{oR})$      Sample standard error of the deviations of the RCA Laser
                 Range-Finder measured observations from the AN/APN-22
                 Altimeter measured observations

$N_T(d)$         Total number of positive and negative deviations
BACKGROUND. Experiments conducted in 1963-4 in the Arctic region
by the U. S. Army Electronics Command Avionics Laboratory have confirmed
that the electrical properties of polar ice and snow do, in fact, cause
microwave frequencies to suffer high surface reflection losses, and low
transmission losses within the medium. These results correlate well
with the theoretical predictions. Specifically, these experiments
revealed that, for standard 4.3 GHz radar altimeter frequencies, normally
incident electromagnetic waves impinging upon essentially uncontaminated
snow surfaces effectively penetrate the snow/ice media to depths of
several hundred feet. In many instances, sub-surface interfaces provide
signal reflections which are of substantial amplitude and are readily
detectable at the radar altimeter receiver. These sub-surface reflections
can be as much as 20 db stronger than the surface reflections.

Findings  further  showed  that  radar  altimeters  employing  nanosecond 
pulse  leading-edge-tracking  techniques  are  significantly  more  accurate 
than  those  utilizing  frequency  modulation.  The  accuracies  of  these 
techniques  differed  greatly  because  the  pulse  system  measured  altitude 
from  the  closest  terrain  surface  (the  leading  edge  of  the  reflected 
RF pulse) whereas the FM-CW system integrated all the surface and
sub-surface signal returns, with no discrimination against the more distant
radar  echoes. 

Specifically,  CW  altimeter  errors  as  great  as  150  feet  were 
recorded  for  an  actual  altitude  of  300  feet.  Pulse  altimeter  errors 
were  considered  negligible;  in  fact,  they  were  not  measurable  since 
they did not exceed the instrumentation error inherent in the experiment.

In  April  1967,  Research  and  Development  personnel  of  the  U.  S.  Army 
Electronics  Command  Avionics  Laboratory  conducted  radar  altimeter  tests 
in  a  Choctaw  CH-34C  helicopter  over  the  three-story  high  rain  forests 
of  the  Panama  Canal  Zone.  These  tests,  the  first  of  their  kind,  were 
made  possible  through  the  use  of  specially  designed  instrumentation 
including  an  air-portable  range  finder  with  a  height-measuring  accuracy 
of  one  meter. 

During the tests, altimeter data was continuously gathered and
recorded while the project aircraft was flown 1000 miles over dense
jungle. The project personnel previously conducted experiments in
Greenland, which first showed the unique potential of this nanosecond
pulse radar, with its leading-edge-tracking technique, for providing
accurate height measurements over deep ice and snow of the Arctic
region.


While  most  radar  altimeters  provide  relatively  accurate  height 
information  over  large,  flat  airstrips  they  typically  become  highly 
unreliable  and  grossly  in  error  when  employed  over  varying  terrain  such 
as  deep  polar  ice  and  snow  or  high  jungle  foliage. 
In addition to providing radar altimeter performance data not
heretofore  obtained,  the  technical  information  resulting  from  these 
tests  proved  a  significant  factor  in  selection  of  radar  altimeter  design 
techniques  most  suitable  for  Army  aircraft  applications.  More  recently, 
procurement  of  the  AN/APN-171  Radar  Altimeter,  employing  this  recommended 
design  concept,  has  been  initiated  for  Mohawk  aircraft  applications. 

1. INTRODUCTION. Through the support of USATTC (USA Tropic Test
Center), Ft. Clayton, Canal Zone, including the Army Aviation Detachment
at Albrook AFB, radar altimeter tests were conducted over the high jungle
canopy in the vicinity of Rio Chagres and Rio Pina, with three flights
on April 4, 5, and 7. The tests were conducted in accordance with the
procedure and objective as stated in the test plan prepared by the
Office of Operations Research entitled: Design of Experiments for
Radar Altimeter Techniques at the Tropical Test Center, Panama Canal
Zone. Data was obtained from 16 hours of flight time at altitudes of
600 feet and 1000 feet, using a CH-34C helicopter bearing tail #34508.

Altogether, approximately 500 bits of data were obtained, each
representing a comparison of the indicated altitude of one of the test
item radar altimeters with actual aircraft height measured through use
of precise laser distance-measuring equipment with a 1-meter accuracy.

2.  SUMMARY  AND  CONCLUSIONS.  It  is  to  be  understood  that  these 
conclusions  are  primarily  statistical  in  character,  and  hence,  are 
(statistical)  inferences  drawn  from  the  evidence  based  solely  on  the 
observations.  Furthermore,  the  observations  in  this  analysis  are 
deviations  of  measurements  from  the  reference  measurements  of  altitude, 
pitch,  and  roll. 

The following conclusions emerge from this analysis:

2.1  The  7091-A  Altimeter  observations  were  predominantly  negative 
and,  hence,  the  readings  were  consistently  less  than  the  respective  RCA 
Laser reference readings on all flights (see Table 2).

2.2  The  AN/APN-22  Altimeter  readings  were  predominantly  positive 
and,  hence,  the  readings  were  consistently  greater  than  those  of  the 
respective  RCA  Laser  reference  readings  with  the  exception  of  readings 
taken  at  a  height  of  600  feet  and  a  velocity  of  70  knots  (see  Table  2). 

The  following  plausible  explanation  for  conclusions  2.1  and  2.2  is 
offered by the Project Engineer: It appears that the narrow (one
milliradian) laser beam penetrated some appreciable distance through openings
in  the  rain  forest  canopy  before  striking  the  uppermost  foliage  layer. 

2.3  The  analysis  of  variance  technique  shows  that  the  population 
means  of  the  test  items,  namely  7091-A  and  AN/APN-22  Altimeters,  are 
significantly different at a level of significance of .01 on all flights
for  which  data  were  obtained. 
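A minimal sketch of the analysis of variance comparison referred to in conclusion 2.3 follows. It assumes a one-way layout with the two altimeters as groups; the deviations listed are invented for illustration only and are not the flight data, and the function name is hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group size times squared offset of group mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: scatter of each observation about its group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Illustrative deviations from the laser reference, in feet (NOT measured values)
d_mR = [-12, -8, -15, -10, -9, -11]   # hypothetical 7091-A deviations
d_oR = [14, 20, 9, 17, 12, 18]        # hypothetical AN/APN-22 deviations

f_value, df_b, df_w = one_way_anova_f([d_mR, d_oR])
# The tabulated 1% point of F for 1 and 10 degrees of freedom is about 10.0
significant_at_01 = f_value > 10.04
```

With only two groups the F ratio is the square of the two-sample t statistic, so either test leads to the same conclusion about the difference in means.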
2.4 The standard deviations of pitch angle observations taken
during the use of the AN/APN-22 Altimeter were consistently less than
the respective standard deviations of pitch angle observations taken
during use of the 7091-A Altimeter (see Table 3).

2.5 For the combined positive and negative observations at a
height of 1000 feet at both velocities, the absolute magnitudes
of the means and the standard deviations of the 7091-A observations are
consistently less than the respective means (absolute magnitudes) and
standard deviations of the AN/APN-22 observations. The opposite results
are obtained at a height of 600 feet and at a velocity of 70 knots (see
Table 1).
NOTE: No experimentation was performed with the AN/APN-22 Altimeter at a height of 600 feet and a
velocity of (55±5) knots.




GRAPHS of OBSERVATIONS
of the
7091-A and AN/APN-22 ALTIMETERS

FREQUENCY COMPARISON OF ALTIMETER DEVIATIONS
3. DESIGN OF EXPERIMENT.


3.1  Objective  of  Experiment:  The  objective  of  the  experimental 
test  is  to  evaluate  the  accuracy  of  the  Honeywell  Model  7091-A  Altimeter 
(Modified) in a tropical zone, utilizing the nanosecond pulse
leading-edge-tracking technique.

3.2  Test  Item  1:  The  Honeywell  Model  7091-A  Altimeter  (Modified) 
utilizing  the  nanosecond  pulse,  leading-edge-tracking  technique  appeared 
capable  of  reasonably  good  accuracy  in  previous  tests  in  the  temperate 
and arctic zones. (See "Radar Altimeter Techniques in the Arctic
Environment", R. J. Lucas & R. C. Cruickshank, Presentation at 1966 Meeting of
the AGARD Avionics Panel (NATO), Avionics Laboratory, USAECOM, Ft. Monmouth,
N. J.) Thus, the same altimeter was selected for testing to determine the
accuracy  of  its  design  technique  in  a  tropical  zone.  The  Honeywell  Model 
7091-A  Altimeter  (Modified)  will  be  referred  to  as  Test  Item  1. 

3.3  Test  Item  2:  In  the  statistical  analysis  of  this  experiment, 
the  accuracy  of  the  AN/APN-22  Altimeter,  utilizing  a  frequency  modulation, 
continuous-wave design technique, is compared with the accuracy of the
Honeywell  Model  7091-A  Altimeter  (Modified).  The  AN/APN-22  will  be 
referred  to  as  Test  Item  2. 

3.4  Standard  of  Reference:  The  standard  of  reference  for  evaluating 
and  comparing  the  accuracies  of  the  two  test  items  is  the  RCA  Laser 
Range-Finder AN/GVS-1 (XE-6), which has a one (1) meter error (one sigma).

The  test  items  and  the  instrumentation  were  installed  in  the  CH-34C 
(CHOCTAW),  a  helicopter  capable  of  seating  twelve  people. 

3.5 Measured Observations: The measured observations consisted of
height readings above the closest foliage at height levels of 600 feet
and 1000 feet for the following pieces of equipment:

a. Honeywell Model 7091-A Altimeter (Modified)

(1) The observations were measured in feet.

(2) The measured observations of height using the
Honeywell Model 7091-A Altimeter are symbolized
by $H_m$.

b. AN/APN-22 Altimeter

(1) The observations were measured in feet.

(2) The measured observations of height using the
AN/APN-22 Altimeter are symbolized by $H_o$.

c. RCA Laser Range-Finder AN/GVS-1 (XE-6)

(1) The observations of height from this piece of
equipment serve as a standard reference to determine
the deviations of height for both the Honeywell
Model 7091-A Altimeter and the AN/APN-22 Altimeter.

(2) The measured observations of height using this
piece of equipment are symbolized by $H_R$.

(3) The observations were measured in meters and
converted to feet in the calculations.

(4) The Laser Range-Finder has an accuracy of ±1 meter
(= 1 sigma).


(5)  Environmental  specifications: 


1  o  error  - 


where 


5  -  [5  “ 


+  3SH  +  5* 


(-M 

at  .1 


is  the  altitude  rate. 


(6) It was intended to obtain measured observations
of height at the following levels of height:

H1     H2     H3     H4     H5     H6
400    600    800    1000   1200   1400   (ft.)

However, the data was obtained only at the 600 ft.
and 1000 ft. height levels in the actual experimentation.
Data was recorded at the rate of one observation
every minute at the 600 ft. and 1000 ft. height
levels.

The means and standard deviations of $d_{mR}$ were computed
and also the number (N) of positive and negative
deviations from $H_R$:

$H_1$ = 600 ft.

$d_{mR} = H_m - H_R$
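The computation just described (convert the laser reference from meters to feet, form the deviations, then summarize them) can be sketched as follows. The readings are invented for illustration and are not flight data; the function name and the dictionary keys are hypothetical, and the sample standard deviation with an N - 1 divisor is an assumption, since the report does not state which divisor was used.

```python
from statistics import mean, stdev

FEET_PER_METER = 3.28084

def deviation_summary(altimeter_ft, laser_m):
    """Summarize deviations d = H_altimeter - H_R, with H_R converted to feet."""
    reference_ft = [h * FEET_PER_METER for h in laser_m]
    d = [hm - hr for hm, hr in zip(altimeter_ft, reference_ft)]
    return {
        "mean": mean(d),
        "std": stdev(d),                        # sample standard deviation (N - 1)
        "n_pos": sum(1 for x in d if x > 0),    # number of positive deviations
        "n_neg": sum(1 for x in d if x < 0),    # number of negative deviations
    }

# Illustrative readings near the 600 ft. level (NOT measured values)
H_m = [590.0, 605.0, 598.0, 610.0]      # altimeter readings, feet
H_R = [183.0, 183.5, 184.0, 182.5]      # laser readings, meters

summary = deviation_summary(H_m, H_R)
```

The same function applies unchanged to the AN/APN-22 readings $H_o$, giving the corresponding summary of $d_{oR}$.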


3.6 Controlled Parameters (constraints):

a. Velocity of Aircraft (CH-34C)

(1) (55±5) knots

(2) (70±5) knots

(3) The quantal error is 5; i.e., the velocity is
between 50 and 60 knots or between 65 and 75
knots.

b. Aircraft Attitude

(1) An attempt was made to maintain the aircraft
attitude to within ±2° from the vertical (90°);
i.e., the deviations of pitch angle ($\theta_p$) and
the deviations of roll angle ($\theta_r$) should each
be within ±2° of the vertical.

(2) For each RCA Laser Range-Finder measurement
taken, corresponding measurements of $\theta_p$ and
$\theta_r$ were taken.

(3) A positive measurement of $\theta_p$ indicates the nose
of the aircraft is in an upward position; a
negative measurement of $\theta_p$ indicates the nose
of the aircraft is in a downward position.

(4) A positive measurement of $\theta_r$ indicates the
aircraft is rolling to the right; a negative
measurement of $\theta_r$ indicates the aircraft is
rolling to the left.

(5) It was intended to select from the data a set of
observations $H_m$ corresponding to the low attitude
angles $\theta_p$ and $\theta_r$ arranged in order of magnitude
for each level of height (H):


H1 = 600 ft.

[Table: columns $\theta_p$, $H_m$, $\theta_r$; rows ordered by attitude angle (2°, 2°, 2.5°, ...).]

(Likewise for H2 = 1000 ft.)
c. Aircraft Height (H) - Levels of height are 600 ft.
and 1000 ft. However, it was intended to use height
levels of 400 ft. to 1400 ft. in steps of 200 ft.

3.7  Comparison  Between  Test  Items  1  and  2:  For  each  of  the 
two  levels  of  H,  height  readings  were  to  be  taken  as  recorded  from 
the  AN/APN-22.  These  will  be  compared  with  the  output  of  7091-A  (Test 
Item  1)  and,  of  course,  with  the  Standard  of  Reference,  the  RCA  Laser: 


H1 = 600 ft.

[Table: columns LASER ($H_R$), 7091-A ($H_m$), AN/APN-22 (old equipment) ($H_o$), $d_{mR}$, $d_{oR}$; one row per observation, with a final row s under $d_{mR}$ and $d_{oR}$.]

where $H_o$ = height recorded by old equipment
(AN/APN-22 Altimeter)

The standard deviations of the observations from the reference
data were computed to obtain the distribution of the errors. However,
in the actual experimentation, data were obtained only at the 600 ft.
and 1000 ft. levels of height.

3.8  List  of  Equipments: 

a.  Equipment  Items  to  be  Tested: 

(1)  Test  Item  1  -  Honeywell  Model  7091-A 

Altimeter  (Modified) 

(2)  Test  Item  2  -  AN/APN-22  Altimeter 

b.  Test  Instrumentation  and  Accuracies: 

(1) RCA Laser Range-Finder AN/GVS-1 (XE-6), accuracy
1 meter (1 σ) - the output of the Laser is recorded
on a decimal drum readout in digital form.

(2) Recorder, Mark 280, Brush (Precision Servo Penmotor
Recorder), 2-Channel with two events channels,
accuracy five feet (5 ft.)

(a) Two DC analog channels recorded the test item
altimeters' height data

(b) One event mark channel was synchronized with
the laser firing (once every minute).

(3) Vertical Gyro, Cageable, Honeywell Mfr.
Part No. JG 7044 A-35, SN.04, accuracy 0.5° (1 σ)

(a)  Used  to  establish  vertical  standard  for 
measurements  of  the  deviations  of  pitch 
angle  and  roll  angle  from  the  vertical 
(90°) 

(b)  Pitch  attitude  deviations  recorded  as  positive 
or  negative 

(c)  Roll  attitude  deviations  recorded  as  positive 
or  negative 

(d)  The  above  gyro  outputs  are  displayed  on  a 
zero-center  meter  and  are  recorded  with  each 
laser  firing  on  the  decimal  drum  readout. 

(4) TS-352/U Multimeter, Tektronix Model 422 scope, HP
Model G382A Variable Attenuator (Precision). The
HP Model is used to check the sensitivity (loop
gain) of the Test Items.

NOTE: It is to be emphasized that velocity was not
treated as a factor, since the radar response (with
1000 pulses per second) would not be affected by
velocities below 300-400 knots. This is the reason
that no interactions were computed.


*The  remainder  of  this  paper  was  reproduced  photographically  from  the 
author’s  manuscript. 


4. Analysis of Variance Computations:

(For Testing the Hypothesis of Equal Means Between Test
Item Height Observations)


Flight #3    Height = 600 feet    Velocity (70±5) knots

7091-A (Test Item 1)

$n_1 = 4 + 25 = 29$
$X_{1.} = -979.0$
$\bar{X}_{1.} = -979.0/29 = -33.75$ feet
$\sum_j X_{1j}^2 = 57517.12$

AN/APN-22 (Test Item 2)

$n_2 = 35 + 35 = 70$
$X_{2.} = 20.6$
$\bar{X}_{2.} = 20.6/70 = .29$ feet
$\sum_j X_{2j}^2 = 56241.78$

$T = N = 29 + 70 = 99$

$\bar{X}_{..} = (-979.0 + 20.6)/99 = -9.68$

$\sum_i \sum_j X_{ij}^2 = 57517.12 + 56241.78 = 113758.90$

$T\bar{X}_{..}^2 = 99(-9.68)^2 = 9276.54$

Sum of Squares Between Groups (SSB):

$SSB = \sum_i n_i \bar{X}_{i.}^2 - T\bar{X}_{..}^2 = 29(-33.75)^2 + 70(.29)^2 - 9276.54 = 23762.16$

Sum of Squares Within Groups (SSW):

$SSW = \sum_i \sum_j X_{ij}^2 - \sum_i n_i \bar{X}_{i.}^2 = 113758.90 - [29(-33.75)^2 + 70(.29)^2] = 80720.20$

Total Sum of Squares (SST):

$SST = \sum_i \sum_j X_{ij}^2 - T\bar{X}_{..}^2 = 113758.90 - 99(-9.68)^2 = 104482.36$

Source of Variation    df    SS           MS
Between Groups          1    23762.16     23762.16
Within Groups          97    80720.20       832.17
Total                  98   104482.36

$F_{computed}(1, 97) = 23762.16/832.17 = 28.55$

$F_{.99}(1, 100) = 6.90$ (tabular value of F-distribution)

Since 28.55 >> 6.90, the population means are highly significantly
different at a significance level of .01.
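
The sums of squares above can be cross-checked from the group summaries alone. The following Python sketch is an illustration and not part of the original computations; the function name `one_way_anova` and its tuple layout are assumptions. It works from the unrounded group totals, so its results differ slightly from the tabulated values, which are built from means rounded to two decimals.

```python
def one_way_anova(groups):
    """One-way ANOVA from summaries.

    groups: list of (n, total, sum_of_squares) tuples, one per group.
    Returns (ssb, ssw, df_between, df_within, f_ratio).
    """
    n_total = sum(n for n, _, _ in groups)
    grand_total = sum(t for _, t, _ in groups)
    raw_sum_sq = sum(q for _, _, q in groups)

    correction = grand_total ** 2 / n_total          # T * (grand mean)^2
    between_raw = sum(t ** 2 / n for n, t, _ in groups)  # sum of n_i * mean_i^2

    ssb = between_raw - correction
    ssw = raw_sum_sq - between_raw
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ssb / df_between) / (ssw / df_within)
    return ssb, ssw, df_between, df_within, f_ratio


# Flight #3, 600 feet: (n, sum of deviations, sum of squared deviations)
ssb, ssw, df_between, df_within, f_ratio = one_way_anova(
    [(29, -979.0, 57517.12), (70, 20.6, 56241.78)]
)
print(round(ssb, 2), round(ssw, 2), df_between, df_within, round(f_ratio, 2))
```

The F ratio obtained this way stays far above the tabular value of 6.90, so the conclusion is unchanged by the rounding.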
Flight #3    Height = 1000 feet    Velocity (70±5) knots

7091-A (Test Item 1)

$n_1 = 7 + 45 = 52$
$X_{1.} = -1517.20$
$\bar{X}_{1.} = -1517.20/52 = -29.18$ feet
$\sum_j X_{1j}^2 = 82670.32$

AN/APN-22 (Test Item 2)

$n_2 = 53 + 5 = 58$
$X_{2.} = 3563.90$
$\bar{X}_{2.} = 3563.90/58 = 61.45$ feet
$\sum_j X_{2j}^2 = 291629.06$

$T = N = 52 + 58 = 110$

$\bar{X}_{..} = (-1517.20 + 3563.90)/110 = 18.61$

$\sum_i \sum_j X_{ij}^2 = 82670.32 + 291629.06 = 374299.38$

$T\bar{X}_{..}^2 = 110(18.61)^2 = 38096.30$

Sum of Squares Between Groups (SSB):

$SSB = \sum_i n_i \bar{X}_{i.}^2 - T\bar{X}_{..}^2 = 52(-29.18)^2 + 58(61.45)^2 - 38096.30 = 225193.94$

Sum of Squares Within Groups (SSW):

$SSW = \sum_i \sum_j X_{ij}^2 - \sum_i n_i \bar{X}_{i.}^2 = 374299.38 - 263290.24 = 111009.14$

Total Sum of Squares (SST):

$SST = \sum_i \sum_j X_{ij}^2 - T\bar{X}_{..}^2 = 374299.38 - 38096.30 = 336203.08$

Source of Variation    df    SS           MS
Between Groups          1   225193.94    225193.94
Within Groups         108   111009.14      1027.86
Total                 109   336203.08

$F_{computed}(1, 108) = 225193.94/1027.86 = 219.09$

$F_{.99}(1, 100) = 6.90$ (tabular value of F-distribution)

Since 219.09 >> 6.90, the population means are
highly significantly different at a significance level of .01.
Flight #2    Height = 1000 feet    Velocity (55±5) knots

7091-A (Test Item 1)

$n_1 = 12 + 79 = 91$
$X_{1.} = -3135.00$
$\bar{X}_{1.} = -3135.00/91 = -34.45$ feet
$\sum_j X_{1j}^2 = 180825.10$

AN/APN-22 (Test Item 2)

$n_2 = 78 + 13 = 91$
$X_{2.} = 3374.80$
$\bar{X}_{2.} = 3374.80/91 = 37.09$ feet
$\sum_j X_{2j}^2 = 239507.26$

$T = N = 91 + 91 = 182$

$\bar{X}_{..} = (-3135.0 + 3374.8)/182 = 1.32$

$\sum_i \sum_j X_{ij}^2 = 180825.10 + 239507.26 = 420332.36$

$T\bar{X}_{..}^2 = 182(1.32)^2 = 317.12$

Sum of Squares Between Groups (SSB):

$SSB = \sum_i n_i \bar{X}_{i.}^2 - T\bar{X}_{..}^2 = 91(-34.45)^2 + 91(37.09)^2 - 317.12 = 232867.70$

Sum of Squares Within Groups (SSW):

$SSW = \sum_i \sum_j X_{ij}^2 - \sum_i n_i \bar{X}_{i.}^2 = 420332.36 - 233184.82 = 187147.54$

Total Sum of Squares (SST):

$SST = \sum_i \sum_j X_{ij}^2 - T\bar{X}_{..}^2 = 420332.36 - 317.12 = 420015.24$

Source of Variation    df    SS           MS
Between Groups          1   232867.70    232867.70
Within Groups         180   187147.54      1039.71
Total                 181   420015.24

$F_{computed}(1, 180) = 232867.70/1039.71 = 223.97$

$F_{.99}(1, 150) = 6.81$ (tabular value of F-distribution)

Since 223.97 >> 6.81, the population means are
highly significantly different at a significance level of .01.
5. t-Test Computations:

Testing the Hypothesis of Equal Means Between Test Item
Height Observations

Only negative deviations for the 7091-A Altimeter and positive deviations
for the AN/APN-22 Altimeter were used in the following calculations because
of their predominant occurrence in the data.


Flight #3    Height = 600 feet    Velocity: (70±5) knots

7091-A (Test Item 1)         AN/APN-22 (Test Item 2)
$n_1 = 25$                   $n_2 = 35$
$\bar{X}_1 = -40.96$         $\bar{X}_2 = 22.61$
$s_1 = 24.15$                $s_2 = 16.82$

One-Sided Test:

$t = (\bar{X}_1 - \bar{X}_2)\sqrt{\frac{n_1 n_2 (n_1 + n_2 - 2)}{(n_1 + n_2)(n_1 s_1^2 + n_2 s_2^2)}}$

$t = [-40.96 - 22.61]\sqrt{\frac{(25 \times 35)(25 + 35 - 2)}{(25 + 35)[25(24.15)^2 + 35(16.82)^2]}}$

t = -20.91

For a significance level of α = .05, the tabular value of the
t-distribution table is: $t_{1-.05}(50) = t_{.95}(50) = 1.67$.

Since -20.91 << -1.67, the population means of the test items
are highly significantly different at α = .05 level of significance.

Furthermore, for a significance level of α = .005, the tabular
value of the t-distribution table is: $t_{1-.005}(50) = t_{.995}(50) = 2.68$.

Therefore, since -20.91 << -2.68, the population means
of the test items are also highly significantly different at α = .005
level of significance.

Two-Sided Test:

For a significance level of α = .05, the tabular value of the
t-distribution table is: $t_{1-.05/2}(50) = t_{.975}(50) = 2.01$.

Since |t| = 20.91 >> 2.01, the population means of the test
items are highly significantly different at α = .05 level of significance.

Furthermore, for a significance level of α = .01, the tabular
value of the t-distribution table is: $t_{1-.01/2}(50) = t_{.995}(50) = 2.68$.

Since |t| = 20.91 >> 2.68, the population means of the test
items are also highly significantly different at α = .01 level of
significance.
Flight #2    Height = 1000 feet    Velocity: (55±5) knots

7091-A (Test Item 1)         AN/APN-22 (Test Item 2)
$n_1 = 79$                   $n_2 = 78$
$\bar{X}_1 = -40.96$         $\bar{X}_2 = 46.73$
$s_1 = 24.35$                $s_2 = 28.01$

One-Sided Test:

$t = [-40.96 - 46.73]\sqrt{\frac{(79 \times 78)(79 + 78 - 2)}{(79 + 78)[79(24.35)^2 + 78(28.01)^2]}}$

t = -20.80

For a significance level of α = .05, the tabular value of the
t-distribution table is: $t_{1-.05}(100) = t_{.95}(100) = 1.66$.

Since -20.80 << -1.66, the population means of the test items
are highly significantly different at α = .05 level of significance.

Furthermore, for a significance level of α = .005, the tabular
value of the t-distribution table is: $t_{1-.005}(100) = t_{.995}(100) = 2.63$.

Therefore, since -20.80 << -2.63, the population means of the
test items are also highly significantly different at α = .005 level of
significance.

Two-Sided Test:

For a significance level of α = .05, the tabular value of the
t-distribution table is: $t_{1-.05/2}(100) = t_{.975}(100) = 1.98$.

Since |t| = 20.80 >> 1.98, the population means of the test
items are highly significantly different at α = .05 level of significance.

Furthermore, for a significance level of α = .01, the
tabular value of the t-distribution table is: $t_{1-.01/2}(100) = t_{.995}(100) = 2.63$.

Since |t| = 20.80 >> 2.63, the population means of the test
items are also highly significantly different at α = .01 level of
significance.
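
As an illustrative check on the Flight #2 computation, the t statistic can be recomputed from the six summary values. This Python sketch is not part of the original paper; the function name `pooled_t` is an assumption, and the formula follows the form used in this section, in which each sample variance is weighted by its sample size n.

```python
import math


def pooled_t(n1, mean1, s1, n2, mean2, s2):
    """Two-sample t statistic with a pooled variance.

    Algebraically the same as
    (mean1 - mean2) * sqrt(n1*n2*(n1+n2-2) / ((n1+n2)*(n1*s1^2 + n2*s2^2))).
    """
    pooled_var = (n1 * s1 ** 2 + n2 * s2 ** 2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (n1 + n2) / (n1 * n2))
    return (mean1 - mean2) / std_error


# Flight #2, 1000 feet: 7091-A negative deviations vs. AN/APN-22 positive deviations
t_flight2 = pooled_t(79, -40.96, 24.35, 78, 46.73, 28.01)
print(round(t_flight2, 2))
```

The result agrees with the reported value of about -20.8 to within rounding.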
Flight #3    Height = 1000 feet    Velocity: (70±5) knots

7091-A (Test Item 1)         AN/APN-22 (Test Item 2)
$n_1 = 45$                   $n_2 = 53$
$\bar{X}_1 = -34.57$         $\bar{X}_2 = 67.96$
$s_1 = 24.01$                $s_2 = 29.52$

One-Sided Test:

$t = [-34.57 - 67.96]\sqrt{\frac{(45 \times 53)(45 + 53 - 2)}{(45 + 53)[45(24.01)^2 + 53(29.52)^2]}}$

t = -18.42

For a significance level of α = .05, the tabular value of
the t-distribution table is: $t_{1-.05}(80) = t_{.95}(80) = 1.66$.

Since -18.42 << -1.66, the population means of the test items
are highly significantly different at α = .05 level of significance.

Furthermore, for a significance level of α = .005, the tabular
value of the t-distribution table is: $t_{1-.005}(80) = t_{.995}(80) = 2.64$.

Since -18.42 << -2.64, the population means of the test items
are also highly significantly different at α = .005 level of significance.

Two-Sided Test:

For a significance level of α = .05, the tabular value of
the t-distribution table is: $t_{1-.05/2}(80) = t_{.975}(80) = 1.99$.

Since |t| = 18.42 >> 1.99, the population means of the test
items are highly significantly different at α = .05 level of significance.

Furthermore, for a significance level of α = .01, the
tabular value of the t-distribution table is: $t_{1-.01/2}(80) = t_{.995}(80) = 2.64$.

Since |t| = 18.42 >> 2.64, the population means of the test items
are also highly significantly different at α = .01 level of significance.
Flight #3    Height = 1000 feet    Velocity (70±5) knots

7091-A (Test Item 1)             AN/APN-22 (Test Item 2)
$n_1 = 45$                       $n_2 = 53$
$s_{d_{mR}} = 24.01$             $s_{d_{oR}} = 29.52$
$s^2_{d_{mR}} = 576.48$          $s^2_{d_{oR}} = 871.43$

$F_{computed}(52, 44) = 871.43/576.48 = 1.51$

$F_{.95}(50, 44) = 1.63$ (tabular value of F-distribution)

$F_{.99}(50, 44) = 2.00$ (tabular value of F-distribution)

Since 1.51 < 1.63 and 1.51 < 2.00, the hypothesis that
$\sigma^2_{d_{mR}} = \sigma^2_{d_{oR}}$ is not contradicted by the observed data at either
significance level of α = .05 or α = .01.

Flight #2    Height = 1000 feet    Velocity (55±5) knots

7091-A (Test Item 1)             AN/APN-22 (Test Item 2)
$n_1 = 79$                       $n_2 = 78$
$s_{d_{mR}} = 24.35$             $s_{d_{oR}} = 28.01$
$s^2_{d_{mR}} = 592.92$          $s^2_{d_{oR}} = 784.56$

$F_{computed}(77, 78) = 784.56/592.92 = 1.32$

$F_{.95}(75, 80) = 1.45$ (tabular value of F-distribution)
$F_{.99}(75, 80) = 1.70$ (tabular value of F-distribution)

Since 1.32 < 1.45 and 1.32 < 1.70, the hypothesis that
$\sigma^2_{d_{mR}} = \sigma^2_{d_{oR}}$ is not contradicted by the observed data at either
significance level of α = .05 or α = .01.
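
The variance-ratio comparison above reduces to a single division. A minimal Python sketch is given below for illustration only; the function name `variance_ratio` is an assumption, and the critical values are simply the tabular values quoted in the text for the Flight #3, 1000-foot case.

```python
def variance_ratio(s_a, s_b):
    """Ratio of the larger sample variance to the smaller one."""
    var_a, var_b = s_a ** 2, s_b ** 2
    return max(var_a, var_b) / min(var_a, var_b)


# Flight #3, 1000 feet: standard deviations of the two altimeters' deviations
f_flight3 = variance_ratio(24.01, 29.52)

# Tabular values quoted in the text for (50, 44) degrees of freedom
for label, critical in (("alpha = .05", 1.63), ("alpha = .01", 2.00)):
    verdict = "not contradicted" if f_flight3 < critical else "contradicted"
    print(label, round(f_flight3, 2), "<", critical, "->", verdict)
```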
[Table: 7091-A Altimeter, negative deviations (feet): mean confidence interval (feet) and standard deviation confidence interval (ft). Velocity not treated as a factor.]

Computations - Confidence Intervals for Means of 7091-A
Altimeter Measurements

The following confidence interval computations were performed only
with the negative deviations of the 7091-A Altimeter because of their
predominant occurrence in the data.

Confidence Limits = $\bar{X} \pm t_{\alpha}\,\frac{s}{\sqrt{n-1}}$


Flight #1    Height = 600 feet    Velocity: (55±5) knots

n = 51    s = 21.39    $\bar{X}$ = -32.10

95% Confidence Interval = $-32.10 \pm 2.01\,(21.39/\sqrt{50})$ = -32.10 ± 6.0802
= -38.18 to -26.02

99% Confidence Interval = $-32.10 \pm 2.68\,(21.39/\sqrt{50})$ = -32.10 ± 8.1070
= -40.21 to -23.99


Flight #3    Height = 600 feet    Velocity: (70±5) knots

n = 25    s = 24.15    $\bar{X}$ = -40.96

95% Confidence Interval = $-40.96 \pm 2.06\,(24.15/\sqrt{24})$ = -40.96 ± 10.1549
= -51.11 to -30.81

99% Confidence Interval = $-40.96 \pm 2.80\,(24.15/\sqrt{24})$ = -40.96 ± 13.8028
= -54.76 to -27.16


Flight #2    Height = 1000 feet    Velocity: (55±5) knots

n = 79    s = 24.35    $\bar{X}$ = -40.96

95% Confidence Interval = $-40.96 \pm 1.99\,(24.35/\sqrt{78})$ = -40.96 ± 5.4866
= -46.45 to -35.47

99% Confidence Interval = $-40.96 \pm 2.64\,(24.35/\sqrt{78})$ = -40.96 ± 7.2787
= -48.24 to -33.68


Flight #3    Height = 1000 feet    Velocity: (70±5) knots

n = 45    s = 24.01    $\bar{X}$ = -35.34

95% Confidence Interval = $-35.34 \pm 2.015\,(24.01/\sqrt{44})$ = -35.34 ± 7.2934
= -42.63 to -28.05

99% Confidence Interval = $-35.34 \pm 2.69\,(24.01/\sqrt{44})$ = -35.34 ± 9.7361
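
The interval arithmetic for the means follows one pattern throughout: the tabular t value times s divided by the square root of n - 1. The Python sketch below reproduces the Flight #1 case as an illustration; the function name `mean_confidence_interval` is an assumption, and the t value is taken from the text rather than computed.

```python
import math


def mean_confidence_interval(mean, s, n, t_critical):
    """Two-sided limits: mean +/- t_critical * s / sqrt(n - 1)."""
    half_width = t_critical * s / math.sqrt(n - 1)
    return mean - half_width, mean + half_width


# Flight #1, 600 feet: n = 51, s = 21.39, mean = -32.10, tabular t = 2.01
lower, upper = mean_confidence_interval(-32.10, 21.39, 51, 2.01)
print(round(lower, 2), "to", round(upper, 2))
```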

Computations - Confidence Intervals for Standard Deviations
of 7091-A Altimeter Measurements


The following confidence interval computations were performed only
with the negative deviations of the 7091-A Altimeter because of their
predominant occurrence in the data.

$\frac{\nu s^2}{\chi^2_{\nu,1-\alpha}} > \sigma^2$ is the 100(1-α)% upper confidence
limit for $\sigma^2$

$\frac{\nu s^2}{\chi^2_{\nu,\alpha}}$ is the 100(1-α)% lower confidence
limit for $\sigma^2$

Two-sided confidence interval for the unknown $\sigma^2$:

$\frac{\nu s^2}{\chi^2_{\nu,\alpha/2}} < \sigma^2 < \frac{\nu s^2}{\chi^2_{\nu,1-\alpha/2}}$

where ν represents the number of degrees of freedom,
α is the level of significance,
and $\chi^2$ is the Chi-Square distribution.

Flight #1    Height = 600 feet    Velocity: (55±5) knots

n = 51    s = 21.39

95% Confidence Interval: $50(21.39)^2/71.4202 < \sigma^2 < 50(21.39)^2/32.3574$

320.31 < $\sigma^2$ < 706.9976
17.90 < σ < 26.59

99% Confidence Interval: $50(21.39)^2/79.4900 < \sigma^2 < 50(21.39)^2/27.9907$

287.7922 < $\sigma^2$ < 817.2930
16.96 < σ < 28.59
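
The Flight #1 limits for the variance can be reproduced with a few lines of Python. This is an illustrative sketch only; the function name `variance_confidence_interval` and its argument names are assumptions. The chi-square values are the tabular values quoted in the text, where the larger value gives the lower limit and the smaller value gives the upper limit.

```python
import math


def variance_confidence_interval(s, n, chi2_large, chi2_small):
    """Two-sided limits for the variance and the standard deviation.

    chi2_large and chi2_small are the tabulated chi-square values for
    nu = n - 1 degrees of freedom at the two tail probabilities.
    """
    nu = n - 1
    var_lower = nu * s ** 2 / chi2_large
    var_upper = nu * s ** 2 / chi2_small
    return (var_lower, var_upper), (math.sqrt(var_lower), math.sqrt(var_upper))


# Flight #1, 600 feet, 95% interval: n = 51, s = 21.39
(var_lower, var_upper), (sd_lower, sd_upper) = variance_confidence_interval(
    21.39, 51, 71.4202, 32.3574
)
print(round(var_lower, 2), round(var_upper, 2), round(sd_lower, 2), round(sd_upper, 2))
```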


Flight #3    Height = 600 feet    Velocity: (70±5) knots

n = 25    s = 24.15

95% Confidence Interval: $24(24.15)^2/39.3641 < \sigma^2 < 24(24.15)^2/12.4001$

355.5864 < $\sigma^2$ < 1128.8086
18.86 < σ < 33.60

99% Confidence Interval: $24(24.15)^2/45.5585 < \sigma^2 < 24(24.15)^2/9.88623$

307.2388 < $\sigma^2$ < 1415.8463
Flight #2    Height = 1000 feet    Velocity: (55±5) knots

n = 78    s = 28.01

95% Confidence Interval: $77(28.01)^2/106.629 < \sigma^2 < 77(28.01)^2/57.1532$

566.5543 < $\sigma^2$ < 1057.0034
23.80 < σ < 32.51

99% Confidence Interval: $77(28.01)^2/116.321 < \sigma^2 < 77(28.01)^2/51.1720$

519.3484 < $\sigma^2$ < 1180.5504
22.79 < σ < 34.36


Flight #2    Height = 1000 feet    Velocity: (55±5) knots

n = 79    s = 24.35

95% Confidence Interval: $78(24.35)^2/106.629 < \sigma^2 < 78(24.35)^2/57.1532$

433.7277 < $\sigma^2$ < 809.1927
20.83 < σ < 28.45

99% Confidence Interval: $78(24.35)^2/116.321 < \sigma^2 < 78(24.35)^2/51.1720$

397.5890 < $\sigma^2$ < 903.7746
19.94 < σ < 30.06


Flight #3    Height = 1000 feet    Velocity: (70±5) knots

n = 53    s = 29.52

95% Confidence Interval: $52(29.52)^2/71.4202 < \sigma^2 < 52(29.52)^2/32.3574$

634.4756 < $\sigma^2$ < 1400.4333
25.19 < σ < 37.42

99% Confidence Interval: $52(29.52)^2/79.4900 < \sigma^2 < 52(29.52)^2/27.9907$

570.0639 < $\sigma^2$ < 1618.9084
23.88 < σ < 40.24


3. MEAN SQUARE SUCCESSIVE DIFFERENCE: A Test for Randomness


One of the tests used to detect randomness is the mean square
difference method. A brief discussion of this method is deemed
advisable because of the sensitivity of this method to non-random
fluctuations. This method is particularly sensitive in detecting long-term
trends, periodic or excessively rapid oscillations in observed data.

Let us assume that $X_1, X_2, \ldots, X_n$ represent n successive
observations from a population which obeys the normal distribution law:

$f(X) = \frac{1}{\sigma\sqrt{2\pi}} \exp[-(X - \mu)^2/2\sigma^2]$

with the mean μ and standard deviation σ. The sample mean and
standard deviation are defined respectively:

$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$,    $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$

The mean square difference is:

$\delta^2 = \frac{1}{n-1}\sum_{i=1}^{n-1} (X_{i+1} - X_i)^2$

i.e., we compute the mean of the squares of the n-1 successive
differences between the observations.

It can be shown that:

$E\left[\frac{\delta^2}{2s^2}\right] = 1$, and thus $\delta^2/2$ is an unbiased
estimate of $\sigma^2$.

The variance is: $V\left[\frac{\delta^2}{2s^2}\right] = \frac{n-2}{(n-1)(n+1)}$


of $\delta^2$ and $s^2$; we are particularly interested in the ratio
$\delta^2/s^2$, since the disparity between the values of $\delta^2$ and $s^2$
will indicate the trend or short period oscillations in the observations.
It is assumed that the value of $\delta^2$ will not be increased by the trend
as appreciably as $s^2$; hence a small value of the ratio $\delta^2/s^2$
will indicate trends in the observations. In the case of short periods
of oscillations both $\delta^2$ and $s^2$ will increase; and the increases
in $\delta^2$ will be proportionately greater.


The distribution of

$\theta = \frac{\delta^2}{2s^2} - 1$

is symmetrical with average value zero for random samples drawn from
a normal population. For values of n > 25, θ is very nearly normally
distributed with average zero and variance equal to

$\frac{n-2}{(n-1)(n+1)}$

We can use the statistic $t = \theta/\sigma_\theta$ and the percentage point
for a standard normal deviate in testing for significance of θ for
large values of n. Long term trends in the observations would be
indicated by high negative values of t; and high positive values of
t would be symptomatic of short rapid oscillations in the observations.
Significance levels for the $\delta^2/s^2$ ratios have been tabulated by B. I.
Hart (Significance Levels for the Ratio of the Mean Square Successive
Difference to the Variance. Annals of Mathematical Statistics, Vol. XIII,
1942, pp. 445-447).

It is clear from Tables 7 and 8 on the following pages that
the values of t (= $\theta/\sigma_\theta$) at the 95% confidence level are not statistically
significant. This means that the hypothesis of randomness of the data
is not rejected at the 5% significance level.
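
A short Python sketch of the mean square successive difference test may help make the procedure concrete. It is an illustration under stated assumptions and not the authors' program: the function name is hypothetical, the sample variance is taken with divisor n - 1, the statistic is taken as the ratio of the mean square difference to twice the variance, minus one, and the two input series are synthetic examples rather than altimeter data.

```python
import math


def successive_difference_test(x):
    """Return (delta_sq, s_sq, theta, t) for a sequence of observations."""
    n = len(x)
    mean = sum(x) / n
    s_sq = sum((v - mean) ** 2 for v in x) / (n - 1)
    # Mean of the squares of the n - 1 successive differences
    delta_sq = sum((x[i + 1] - x[i]) ** 2 for i in range(n - 1)) / (n - 1)
    theta = delta_sq / (2.0 * s_sq) - 1.0
    sigma_theta = math.sqrt((n - 2) / ((n - 1) * (n + 1)))
    return delta_sq, s_sq, theta, theta / sigma_theta


# Synthetic series: a steady trend and a rapid oscillation
trend = [float(i) for i in range(30)]
oscillation = [1.0, -1.0] * 15

t_trend = successive_difference_test(trend)[3]
t_oscillation = successive_difference_test(oscillation)[3]
print(round(t_trend, 2), round(t_oscillation, 2))
```

With these inputs the trend gives a large negative t and the oscillation a large positive t, matching the interpretation given in the text.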




AN EXPERIMENT USING NUMERICAL ANALYSIS TO MODEL A
FUNCTIONAL RELATION BETWEEN ABM SYSTEM SENSOR
RESPONSES AND REENTRY VEHICLE CHARACTERISTICS


Andrew H. Jenkins
U. S. Army Missile Command
Redstone Arsenal, Alabama


INTRODUCTION. It is believed that prediction models can be developed
by the analysis of experimental data in light of the known physical laws
pertinent to high speed reentry. The development of the model is
accomplished by numerical analysis of the full-scale reentry experimental data
obtained on the eastern and western test ranges.

In the past, considerable effort has been expended to rigorously
and theoretically describe the interdependent and interacting phenomena
of hypervelocity reentry. This is a very complex and difficult job. In
general, the basic theoretical relations are not adequately described
for ideal conditions. More importantly, the real case of reentry is
usually described with even less precision than the ideal case. This is
not to say that progress has not been made in the purely theoretical
approach, nor is it to imply that it should not continue. The selection
of the proper variables and stratification of empirical models depends
upon such efforts.

The phenomenological processes which occur during reentry couple
with the radar sensor to produce gross effects in the measurable
responses. These gross effects are considered to be typical from test
to test, and differ only in the degree or level of effect on the response.
The empirical determination of the degree or level of effect is to relate
the sensor responses to the body parameters and the trajectory parameters
by experimental observation without the full benefit of a complete
theoretical knowledge to describe the underlying physics and chemistry of the
phenomena. This is graphically depicted in Figure 1.

The  variables  used  are  those  which  are  recorded  by  the  radar  system 
on  data  tape  or  published  in  data  reports  and  are,  of  course,  representative 
of  the  real-case  responses  in  a  real  time  frame.  Accurate  estimates  of 
body  characteristics  made  continuously  in  real  time  are  the  ultimate  goal 
of  this  approach.  Also,  it  is  desired  that  the  prediction  models  contain 
sufficient  physical  variables  representative  of  the  sensor,  body,  and 
trajectory  parameters  that  the  simultaneous  masking  of  all  measurables 
becomes  economically  and  practically  infeasible  for  the  offense. 

In the development of empirical prediction models, the operational
conditions should not be ignored. The final utility of any technique
of target identification depends upon the capability of the model to make
accurate real-time estimates. Also, the models should be fairly easy
and economical to incorporate into the defense system, and they should be
capable of being improved in accuracy and updated as more knowledge
of the problem is accumulated.

In the real operational situation, the defense system essentially
stands alone. The identification of the objects in the reentry complex
must be done in real time. This can be less than 60 seconds. The
models should be able to provide continuous estimates of the
characteristics of the objects in the reentry complex. It is also highly desirable
that the model estimation process converge as soon as possible in the
real-time track to the best estimate of the true value of the particular
body parameter (for example, weight). This provides a longer time for
decision making or for intercept at the highest possible point. The
estimation of as many body parameters as possible is obviously highly
desirable. The body parameters can be used for cross-checks on the
estimated values of each other. Every object in a reentry complex will
not be a simple decision case of "warhead-decoy" even with a very precise
model. There will be grey areas. Therefore, it is believed that several
models for different characteristics of the body will be essential in the
final decision to commit an interceptor.

It has not been determined just how more than one body parameter
estimate will be made in real time. It may be required to tabulate
the data in the form of discrete time (for example, altitude) intervals
and develop prediction models for each time increment and body
characteristic rather than use one model throughout the reentry track.

It should be noted that the material presented in this report represents
a minimal effort which is neither complete nor concrete. Some of the
variables used in this "first cut" numerical analysis were selected
because of expediency and availability in order to make a beginning in
this approach.

FORMULATING THE MODEL. The most common physical characteristics
of the body are weight (W), diameter (D), and length (L). The drag
area product used in conjunction with weight can provide an estimate
of ballistic coefficient (β). Shape is one characteristic that affects
the drag area product ($C_D A$) for a given set of reentry trajectory
conditions and is reflected in the value of $C_D A$. This value in turn is
reflected in the ballistic coefficient.

Some measurements that can be made by the radar are: radar cross
section (σ), velocity (V), time derivative of velocity ($\dot{V}$), and
altitude (h). There are characteristics of radar such as wavelength
(λ) and aspect angle (φ) on which the above measurements depend.

The first body characteristic selected to empirically determine
the functional dependence is vehicle weight (W). It is hypothesized
that the estimated weight W' is not a function of sensor characteristics
and trajectory parameters as measured by the sensor system. That is

$H_0$: $W' \neq f(S,T)$,

and similarly for the other physical characteristics of the body

$H_0$: $D' \neq f(S,T)$    (1)

$H_0$: $L' \neq f(S,T)$.

The alternate hypotheses are

$H_1$: $W' = f(S,T)$,

and similarly

$H_1$: $D' = f(S,T)$    (2)

$H_1$: $L' = f(S,T)$

where W', D', L' = estimates of the true values,

S = sensor parameters,

T = trajectory parameters.

The null hypothesis $H_0$ is tested against the alternate hypothesis by
deriving a model of W, D, and L as a function of S and T by regression
analysis.

The general multivariate linear regression analysis is written

$y' = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_p x_p$    (3)

where $x_p$ = the pth independent variable
$a_0$ = the true intercept
$a_p$ = the pth true coefficient
y' = the regression estimate.
An analysis of actual range data is made to test the null hypotheses
that an estimate of a physical characteristic is independent of sensor,
trajectory, and body variables. The range data are analyzed with a
computer program that calculates the regression of the dependent
variables on the independent variables by a stepwise technique. The
regression analysis in the program is linear, but it can be made to
accommodate nonlinear functions by any one of 20 different
transformations, such as logarithms. The analysis first calculates the simple
correlation coefficients between each independent variable and the
dependent variable. The variable with the highest correlation is
selected for the first regression calculation. The linear regression
of the form

$y = a_0 + a_1 x_1$    (4)

is therefore calculated for one of the physical characteristics, say W,
as y, and the independent variable with the highest simple correlation
as $x_1$. Each of the remaining independent variables is then correlated
with y and $x_1$. The variable ($x_2$) is then selected as the variable that
produces the highest of these correlations. A second step regression is
then calculated for the form

$y = a_0 + a_1 x_1 + a_2 x_2$.    (5)

If the correlation of the regression relationship is reduced by
the addition of another variable, this variable is removed. If, however,
the correlation increases, the variable is retained and the step procedure
is repeated for another variable up to the pth variable and coefficient
as shown in Eq. (3).
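
The stepwise procedure described above can be sketched in a few dozen lines of Python. This is a simplified illustration, not the regression program used in the study: the function names, the synthetic data, and the `min_gain` retention threshold are all assumptions, and no variable transformations are included. Because the fitted correlation of a least-squares model never decreases when a variable is added, the sketch retains a variable only if it raises the squared multiple correlation by at least `min_gain`.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= factor * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit(columns, y):
    """Least-squares fit of y on an intercept plus columns; returns (coef, r_squared)."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    coef = solve(xtx, xty)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2 for r, yi in zip(rows, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coef, 1.0 - ss_res / ss_tot


def forward_stepwise(candidates, y, min_gain=0.01):
    """Add, one at a time, the variable that raises R^2 most; stop when the gain is small."""
    chosen, best_r2 = [], 0.0
    remaining = dict(candidates)
    while remaining:
        trial = {name: fit([candidates[c] for c in chosen] + [col], y)[1]
                 for name, col in remaining.items()}
        name = max(trial, key=trial.get)
        if trial[name] - best_r2 < min_gain:
            break
        chosen.append(name)
        best_r2 = trial[name]
        del remaining[name]
    return chosen, best_r2


# Synthetic data: y depends on x1 and x2 only; x3 is unrelated
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
x3 = [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0]
y = [2.0 + 3.0 * a - 1.5 * b for a, b in zip(x1, x2)]

chosen, r_squared = forward_stepwise({"x1": x1, "x2": x2, "x3": x3}, y)
print(chosen, round(r_squared, 6))
```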

Currently,  it  is  believed  that  the  best  body  parameter  for  target 
identification  is  the  weight  of  the  reentry  vehicle.  Therefore,  weight 
was  selected  for  the  initial  effort.  Quantitative  measurements  of 
parameters  obtainable  from  the  field  sensor  are  V,  V,  h,  and  a.  The 
operational  problem  requires  that  the  prediction  model  be  expressed  in 
terms  of  the  parameters  measured  by  the  sensor.  The  radar  cross  section 
is  dependent  on  the  ratio  of  the  plasma  frequency  to  the  incident  radar 
frequency.  The  plasma  frequency  is  in  turn  dependent  on  the  strength  of 
the  shock  front  and  viscous  forces.  The  viscous  forces  determine  velocity, 
acceleration,  and  altitude  changes  as  a  function  of  time.  The  interaction 
and  interdependency  of  these  parameters  (as  well  as  others)  determine  the 
effects of the entire reentry environment perturbations on the magnitudes
of these parameters as measured by the sensor, as well as their histories.
Hence, the change in the inertial force ($F_A$) of the body is caused
by the drag force ($F_D$) acting on the body as it penetrates the earth's
atmosphere. As an initial effort in the development of a prediction
model, these forces were assumed proportional, neglecting gravity. That
is,

$$F_A \propto F_D. \qquad (6)$$

This simple assumption is used as a basis to postulate an equation in
which the constants and coefficients are assumed unknown or at least
different from the Newtonian values. $F_A$ and $F_D$ can be expressed as

$$F_A = ma = \frac{W}{g}\,\dot V \qquad (7)$$

$$F_D = \frac{\rho}{2}\,V^2 C_D A. \qquad (8)$$

Equating (7) and (8) and solving for W, it is found that

$$W = \frac{\rho V^2 C_D A\, g}{2 \dot V}. \qquad (9)$$

The independent variables of Eq. (9) are $\rho$, $V$, $C_D A$, and $\dot V$.
Operationally the radar cannot provide estimates of $\rho$ and $C_D A$ directly.
Therefore, these variables must be expressed in terms of measurements
available from the radar. The density $\rho$ can be expressed as

$$\rho = \rho_0\, e^{-Bh} \qquad (10)$$

where
    $\rho_0$ = standard density,
    $B$ = a constant,
    $h$ = the altitude.

Hence, density is expressed as a function of altitude, a variable which can
be obtained from the radar. The remaining variable $C_D A$ can be expressed as
a function of the radar cross section ($\sigma$).

Bethe, Edwards and McDonald, and Martin have studied the functional
relationship of $\sigma$ and $C_D A$. The relationship developed is of the form

$$\sigma = K\,(C_D A)^N \qquad (11)$$

where
    $K$ = a constant,
    $N$ = the exponent.

Substituting Eq. (10) and (11) into Eq. (9) and rearranging terms, the
following is obtained:

$$W = \frac{\rho_0\, g}{2 K^{1/N}} \cdot \frac{V^2\, \sigma^{1/N}}{\dot V\, e^{Bh}}. \qquad (12)$$
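The substitution leading to Eq. (12) can be written out step by step. The sketch below assumes the forms of Eqs. (9) through (11) as they can be read from the surrounding text, namely an exponential density law and a power-law relation between $\sigma$ and $C_D A$; the symbols are those of the paper.

```latex
\begin{align*}
W &= \frac{\rho V^{2} C_D A\, g}{2\dot V},
  \qquad \rho = \rho_0 e^{-Bh},
  \qquad \sigma = K (C_D A)^{N}
  \;\Rightarrow\; C_D A = \left(\frac{\sigma}{K}\right)^{1/N}, \\[4pt]
W &= \frac{\rho_0 e^{-Bh}\; V^{2}\, (\sigma/K)^{1/N}\, g}{2\dot V}
   = \frac{\rho_0\, g}{2K^{1/N}} \cdot \frac{V^{2}\, \sigma^{1/N}}{\dot V\, e^{Bh}} .
\end{align*}
```

Every quantity on the right other than the constants $\rho_0$, $g$, $K$, $N$, and $B$ is a radar measurement, which is the point of the substitution.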


In the above expression, W is expressed as a constant times the ratio of
$V^2$ and $\sigma^{1/N}$ to $\dot V$ and $e^{Bh}$. Since the relationship is
nonlinear, it must be linearized for the regression program. The equation
is first expressed in the general form

$$W = a_0\, \frac{V^{a_1}\, \sigma^{a_3}}{\dot V^{a_2}\, e^{a_4 h}} \qquad (13)$$

where $a_0$ = the regression constant,
      $a_1, a_2, a_3, a_4$ = regression coefficients.

Natural logarithms (which may not be ideal) were used to linearize
Eq. (13). With the negative signs of the $\dot V$ and $h$ terms absorbed
into the coefficients $a_2$ and $a_4$, it can be expressed as

$$\ln W = \ln a_0 + a_1 \ln V + a_2 \ln \dot V + a_3 \ln \sigma + a_4 h. \qquad (14)$$
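The effect of the logarithmic transformation can be illustrated with a small numerical sketch. The data below are synthetic and the "true" coefficients are invented for the illustration (the paper's data and fitted values are not reproduced); $\dot V$ is taken as the positive magnitude of the deceleration so that its logarithm exists. The point is only that, once logarithms are taken, the power-law model of Eq. (13) becomes an ordinary linear least-squares problem in $\ln a_0, a_1, \ldots, a_4$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 150  # same number of matched data sets as in the paper; values are synthetic

# Hypothetical sensor measurements (arbitrary units, assumed ranges).
V = rng.uniform(10.0, 25.0, n)        # velocity
Vdot = rng.uniform(1.0, 50.0, n)      # magnitude of deceleration (must be > 0)
sigma = rng.uniform(0.1, 10.0, n)     # radar cross section
h = rng.uniform(60.0, 150.0, n)       # altitude, kft

# Invented "true" values of ln a0, a1, a2, a3, a4 for the demonstration.
true = np.array([np.log(5.0), 2.0, -1.0, 0.5, -0.03])

# Generate ln W from the linearized model plus small measurement noise.
lnW = (true[0] + true[1] * np.log(V) + true[2] * np.log(Vdot)
       + true[3] * np.log(sigma) + true[4] * h
       + rng.normal(scale=0.01, size=n))

# Design matrix of the linearized model: [1, ln V, ln Vdot, ln sigma, h].
A = np.column_stack([np.ones(n), np.log(V), np.log(Vdot), np.log(sigma), h])
est, *_ = np.linalg.lstsq(A, lnW, rcond=None)

for name, t, e in zip(["ln a0", "a1", "a2", "a3", "a4"], true, est):
    print(f"{name:6s} true = {t:8.4f}   estimated = {e:8.4f}")
```

Note that altitude enters the design matrix untransformed, because it already appears in the exponent of Eq. (13).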

SELECTION AND REDUCTION OF REENTRY TEST DATA. Analysis of actual
full-scale  reentry  test  data  requires  that  a  historical  sample  of  tests 
be  selected.  The  selection  of  the  tests  requires  the  establishment  of 
certain  criteria. 

The  criterion  for  numerical  analysis  is  that  the  test  data  be 
essentially  complete  throughout  a  prescribed  trajectory  range.  That  is, 
the  radar  must  have  maintained  nearly  continuous  track.  Also,  it  was 
imperative  that  each  channel  of  track  be  accurately  identified  as  to  the 
type of information. (This is to avoid a mixing of the data sets.)

The trajectory criteria were established on the basis of the
deposition of momentum and energy into the disturbed medium through which
the vehicle passes. The deposition of energy begins to be appreciable
in the continuum flow regime when the shock forms and viscous effects
come into play on the body. The effects produced in this regime provide
measurable variables which relate to the underlying physics of the
interchange of momentum and energy between the body and its environment and
the coupling of these phenomena with the sensor responses. If these
phenomena are assumed to be typical, then it remains only to relate the
amplitudes of the sensor returns to the levels of body parameters. For
a typically sized reentry body at typical reentry velocity these energy
interchange effects become pronounced at about 300 kft. However, the
portion of the trajectory selected is from 150 kft. to 60 kft. in order
to bracket a region of maximum kinetic and dynamic interchange for the
sample of vehicles selected.

The  body  criteria  were  selected  simply  to  obtain  a  sample  with  wide 
ranges  of  body  characteristics  such  as  weight,  length,  diameter,  drag- 
area,  and  shape.  One  restraint  placed  on  the  initial  selection  of 
sample  bodies  is  that  all  bodies  be  of  the  ablative  type.  The  rationale 
behind  this  on  the  initial  study  was  to  have  all  bodies  of  the  type 
which  would  at  least  unintentionally  and  somewhat  randomly  contaminate 
the  flow  field  with  typical  reentry  vehicle  materials  for  data  consistency. 

One  other  constraint  placed  on  the  selection  of  the  data  sample 
was  a  constant  radar  frequency.  Future  analyses  could  relax  this  con¬ 
straint  and  the  data  of  several  different  frequencies  could  be  used  to 
develop a more universal model. In the operational mode bistatic
measurements  may  be  made.  It  would  be  desirable  to  have  the  frequency 
variable  included.  The  prediction  model  could  be  adjusted  for  each 
particular  discrete  radar  frequency  used  by  the  system. 

After the criteria for selection of the sample were established, the
data had to be actually selected and reduced to a usable format. The
radar data tapes were located and presumably the proper information
channels identified. A coupling program was developed so that the data
could be directly machine fed from the tapes into the regression program.
A printout of the smoothed values of the sensor measurements
recorded on the tapes was programmed for a check on the data tapes
input  to  the  regression  program.  The  data  were  taken  from  the  data 
tapes  at  the  appropriate  time  after  lift-off,  corresponding  to  the 
established  trajectory  altitude  limits.  The  data  were  smoothed  to 
obtain  discrete  values  in  0.5-sec.  intervals.  (The  intervals  could 
be  shortened  to,  say  0.25-sec.  intervals.)  The  0.5-sec.  intervals 
provide  an  average  of  about  15  matched  set  data  points  for  each 
reentry  test  selected.  A  total  of  10  reentry  bodies  were  included 
in  the  first  analysis.  The  range  of  body  characteristics  selected 
is shown in Table I.

TABLE I

Body shapes:     Simple spheres to complex sphere-cone-cylinder-flare
Body diameters:  7.5 inches to 90 inches
Body lengths:    7.5 inches to 169 inches
Body weights:    17.5 pounds to 7181 pounds


ANALYSIS RESULTS. The reentry data sets as compiled were subjected
to analyses. The values were programmed into the regression analysis on
the computer, a total of 150 matched data sets. As mentioned above, the
data  sets  are  over  the  altitude  regime  of  approximately  150  kft.  to  60  kft. 
They  represent  about  7.3  sec.  of  reentry  time. 

These  data  were  run  on  two  different  types  of  regression  programs 
which  computed  the  same  values  for  the  constants  and  coefficients.  The 
general regression equation is

$$\ln W' = \ln a_0 + a_1 \ln V + a_2 \ln \dot V + a_3 \ln \sigma + a_4 h. \qquad (15)$$

Each  regression  coefficient  was  statistically  tested  for  significance. 

Let $\alpha_1$, $\alpha_2$, $\alpha_3$, and $\alpha_4$ be the true values of the regression
coefficients whose estimates are $a_1$, $a_2$, $a_3$, and $a_4$, respectively. The
following hypotheses are tested:

$$H_0:\ \alpha_i = 0, \qquad H_1:\ \alpha_i \neq 0, \qquad i = 1, 2, 3, 4. \qquad (16)$$

The t test used is as follows:

$$t_i = \frac{a_i - \alpha_i}{S_i} \qquad (17)$$

where $S_i$ = standard deviation of regression coefficient $a_i$, i = 1, 2, 3, and 4.
The computed values for $S_i$ are as follows:

    a_1:  S_1 = 1.218
    a_2:  S_2 = 0.0761
    a_3:  S_3 = 0.4234
    a_4:  S_4 = 0.0206.

The calculations of t are:

$$t_1 = \frac{a_1 - \alpha_1}{S_1} = \frac{8.741 - 0}{1.218} = 7.175 \qquad (18)$$

$$t_2 = \frac{a_2 - \alpha_2}{S_2} = \frac{-0.08688 - 0}{0.0761} = -1.14 \qquad (19)$$

$$t_3 = \frac{a_3 - \alpha_3}{S_3} = \frac{0.2640 - 0}{0.4234} = 0.6236 \qquad (20)$$

$$t_4 = \frac{a_4 - \alpha_4}{S_4} = \frac{-0.1414 - 0}{0.0206} = -6.842 \qquad (21)$$

For a 95 percent confidence level ($\alpha$ = 0.05), the critical value for |t|
is 1.960. Therefore

    |t_1| = 7.175   > 1.960,  Reject H_0,
    |t_2| = 1.141   < 1.960,  Accept H_0,
    |t_3| = 0.6236  < 1.960,  Accept H_0,
    |t_4| = 6.842   > 1.960,  Reject H_0.

Therefore, the regression coefficients $a_2$ and $a_3$ are not significant.
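The significance tests above amount to dividing each estimated coefficient by its standard deviation and comparing the absolute value with 1.960. A minimal sketch follows, using the coefficient estimates and standard deviations quoted in the text; the variable names are illustrative. Ratios recomputed from the rounded printed inputs may differ from the printed t values in the last digits.

```python
# Estimated regression coefficients and their standard deviations,
# as quoted in the text (a1: ln V, a2: ln Vdot, a3: ln sigma, a4: h).
a = {"a1": 8.741, "a2": -0.08688, "a3": 0.2640, "a4": -0.1414}
S = {"a1": 1.218, "a2": 0.0761, "a3": 0.4234, "a4": 0.0206}

T_CRIT = 1.960  # two-sided critical value at the 95 percent level

# t statistic for H0: alpha_i = 0, i.e. (a_i - 0) / S_i.
t = {k: (a[k] - 0.0) / S[k] for k in a}

# A coefficient is judged significant when |t| exceeds the critical value.
significant = {k: abs(t[k]) > T_CRIT for k in a}

for k in a:
    verdict = "reject H0" if significant[k] else "accept H0"
    print(f"{k}: t = {t[k]:8.3f}  ->  {verdict}")
```

The comparison is made on the absolute value of t, so that large negative ratios such as that for $a_4$ are judged on the same footing as large positive ones.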

The regression equation is now recalculated as

$$\ln W' = \ln a_0 + a_1 \ln V + a_4 h, \qquad (22)$$

which  expresses  the  weight  of  a  reentry  body  in  terms  of  the  velocity 
and  altitude  as  determined  from  the  radar  sensor.  The  regression 
equation  was  calculated  to  be  as  shown  in  Eq.  (23).  The  actual  values 
are not shown for security reasons.

$$\ln W' = \ln a_0 + a_1 \ln V + a_4 h. \qquad (23)$$

The correlation coefficient is

    r = 0.602

The final equation (23) was used to calculate point-by-point
estimates  of  13  independent  reentry  object  weights.  The  range  of 
characteristics  of  these  objects  is  shown  in  Table  II. 


TABLE II

Shape:      Simple sphere to sphere-cone-cylinder-flare
Diameters:  4 inches to 40 inches
Lengths:    12 inches to 167 inches
Weights:    7.5 pounds to 3,390 pounds


The  calculated  values  are  shown  in  point  plots  of  estimated  weight 
versus  altitude  simulating  real  time  estimates  of  object  weight.  These 
plots  are  Figures  2  through  14. 

A  composite  plot  of  all  thirteen  bodies  is  shown  in  Figure  15. 

This is a semi-log plot of the best estimate of weight versus altitude,
which comes out as a reasonably straight line, as is expected in view
of the transformation of the data to fit the hypothesized equation.

You will have noticed that the plots show positive or negative slopes,
indicating increasing or decreasing weight estimates as a function of
altitude (that is, time). Only two plots have indicated both
positive  and  negative  slopes  where  the  true  weight  was  estimated  twice 
during  the  time-frame  of  calculation.  The  desirable  shape  of  the  real 
time  plots  of  individual  objects  is  shown  in  Figure  16.  It  would  be 
desirable  to  have  the  estimate  converge  to  an  asymptote  to  the  true 
weight  within  some  established  confidence  limits.  All  bodies  displaying 
these  curve  forms  could  be  classified  as  decoy  or  RV.  Those  outside  the 
confidence limits would be engaged.

DISCUSSION. The results of the regression analysis computations
indicate that the two variables, acceleration and radar cross section,
do not significantly relate to the body dimension weight. This is not
to  say  that  they  are  not  significant,  but  rather  that  with  the  data 
sample  used  they  could  not  be  established  as  significant.  There  are 
reasons  which  could  account  for  the  failure  to  establish  significance 
of  these  two  variables.  One  reason  could  be  the  poor  distribution  of 
these  variables  in  the  sample  of  data.  The  poor  distribution  could 
be  due  to  the  error  of  estimating  these  quantities  by  the  radar 
sensors.  Considerable  error  could  be  contained  in  the  estimates  of 
the  negative  acceleration  of  the  body  because  of  the  azimuth  and 
elevation  rate  changes  of  the  antenna  caused  by  shifts  in  tracking 
the  electromagnetic  centroid  of  the  reentry  complex.  The  error  con¬ 
tained  in  the  radar  cross  section  is  possibly  caused  by  the  inherent 
error in the $C_D A$-radar cross section relationship used in the development
of the hypothesized equation.

Another  reason  is  that  the  95  percent  confidence  level  may  be  too 
high  for  the  degree  of  precision  in  making  the  measurements.  A  further 
stratification  of  the  data  could  be  made  that  would  provide  a  range  of 
more  consistent  variation  in  the  acceleration  and  radar  cross  section 
readings. However, this would be useful only for study purposes and
would not improve the inherent inaccuracy of the radar system estimates
in the real operational case. The improvement in the accuracy of the
values would establish their significance and raise the present correlation
coefficient of 0.602. Weight estimates of the reentry vehicle
would be more accurate with an improved correlation coefficient.

SOME EXPERIENCES IN LABORATORY CONTROL INVESTIGATION*


Sigmund  P.  Zobel 

Cornell  Aeronautical  Laboratory,  Inc. 
Buffalo,  New  York 


1. INTRODUCTION. Statistical quality control as a way of life
in American, Canadian, and British industry is over twenty years of
age. Further, as a logical extension of the scope of applied statistics
in industry, statistically designed experiments and analysis of variance
have been increasingly used over, perhaps, the last fifteen years. One
phase of industrial operations, however, still appears to be rather slow
in making use of the applications of statistical methods to the analysis
and control of its routine activities. I refer to the typical chemical
laboratory,  wherein  there  are  frequently  many  ways  in  which  the  opera¬ 
tion  can  be  made  more  efficient  in  terms  of  precision  and  accuracy  and 
overall  reliability  of  analysis.  Much  of  the  potential  improvement  in 
effectiveness  can  be  delineated,  achieved,  and  preserved  through  the 
use  of  statistical  experimental  and  control  techniques. 

In the past few years, my colleagues and I have had the opportunity
to participate in the study of one such laboratory, whose director
realized  he  had  many  problems  connected  with  the  achievement  of  improved 
precision  and  accuracy  in  the  analytical  results,  and  was  quite  coopera¬ 
tive  in  permitting  experiments  designed  to  shed  light  on  the  specific 
problem  areas,  so  that  control  measures  may  be  instituted  and/or  proce¬ 
dures  changed.  Since  more  or  less  standard  techniques  of  analysis  were 
used,  describing  some  of  our  experiences  in  that  laboratory  may  be  of 
possible  use  to  others  confronted  with  similar  situations.  Objectives, 
means of accomplishment, and conclusions will be discussed to the exclusion
of technical details of computations, or theory.

Let  me  describe  the  setting  for  you.  A  wet  chemistry  laboratory 
routinely analyzes samples resulting from a particular event. Several
hundred samples, over a wide concentration range, are delivered to the
laboratory for analysis. The analysis is performed colorimetrically,
and the goal is to optimize the precision and accuracy of the results.
There are two different colorimetric instruments, with two units of
each, in service. One instrument is mainly automatic in its sample
preparation and analysis; the other is largely manual. There are four
analysts available, and any combination or all four may be assigned to
the job when it comes into the laboratory. The number of samples which
may result from the event is usually larger than can be handled by all
four analysts during a regular 8 hour shift. When the budget permits,


*This work was performed under Contract No. DA 18-033-AMC-280(A), Field
Evaluation Division, Technical Support Directorate, U. S. Army Edgewood
Arsenal, Maryland.

overtime is used to complete the analyses within one calendar day.
Otherwise, the work may require two, or even three calendar days.
Since the reagents and standard solutions have varying degrees of
perishability, delays may be deleterious to the yield estimated from
the analysis.

With this background in mind, the need will be rather obvious
for the various experiments to be discussed in this paper. The first
two or three will be presented in greater detail than the latter
illustrations.

2.  ANALYSIS  OF  ALIQUOT  VOLUMES.  One  factor  contributing  to  bias 
and  variation  in  results  was  believed  to  be  lack  of  uniformity  in  the 
aliquot  volumes  in  the  test  tubes  containing  the  samples.  To  obtain 
the aliquot volumes, the samples are originally collected in larger
vessels, an amount in excess of that prescribed is poured into a test
tube, and a suction apparatus is used to draw off excess liquid to a
purportedly reproducible level. The test tubes are in racks which
hold 40 tubes in a 4 x 10 rectangular array. Since many racks are
used  for  the  analysis  of  a  given  event,  it  was  suspected  that  rack  to 
rack  variation*  and  tube  to  tube  variation  may  be  to  blame  for  some 
bias  and  variation  in  the  analytical  results. 

A  components  of  variance  model  approach  was  selected,  since  the 
interest lay in estimating variances.

2.1 Estimation of Rack Variance.

For the estimation of rack variance, ten test tube racks were
randomly selected. Similarly, forty test tubes were randomly selected
and tared. The test then proceeded as follows:

The  forty  test  tubes  were  placed  in  one  rack,  filled  and  drawn 
down  to  volume.  The  tubes  and  their  contents  were  weighed,  and  the 
total  weight  of  liquid  determined  by  subtracting  the  tare  weight  of 
the empty test tubes. The test tubes were then returned to the same
rack,  refilled,  and  again  drawn  down  to  volume.  Reweighing  of  the 
test  tubes  and  correction  for  tare  weight  then  provided  a  duplicate 
weight  determination  for  the  given  rack.  Upon  repeating  the  above 
procedure  for  each  of  the  ten  racks,  one  obtained  ten  pairs  of 
determinations.  The  measure  of  variance  provided  by  variations  among 
the means of the ten pairs includes rack variance as well as other
random effects. On the other hand, differences between duplicate
determinations made in the same rack provided a variance estimate
from which the rack contribution was eliminated. In this way, it was
possible  to  isolate  the  rack  component  and  compare  its  magnitude  with 


*Since  the  suction  apparatus  is  applied  to  tubes  positioned  in  a  rack, 
variations  in  racks  due  to  nonuniform  depths  of  tube  bottom  recesses 
may contribute to nonuniformity in residual volumes.

the magnitude of the residual variance. Weight, of course, is being
used as a proxy for volume. Table I contains the results of the
analysis of variance.


ANALYSIS OF VARIANCE OF CODED DATA

Source of        Sum of       Degrees of    Mean        Expected
Variance         Squares      Freedom       Squares     Mean Squares

Between Racks    18,122.8         9         2013.6      $\sigma^2 + 2\sigma_R^2$
Within Racks        524.0        10           52.4      $\sigma^2$

TABLE I

From the above, one can solve for $\sigma_R^2$, since $\sigma^2$ is given as
52.4. $\sigma_R^2$ proves to be 980.6. Additional calculations provided the
interesting result that the coefficient of variation representing the
total variance for a random determination on a random test tube in a
random rack was 11.4%, while if determinations are constrained to the
same rack, or if rack variance is eliminated, the coefficient of
variation could be reduced to 2.6%.
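The arithmetic behind the rack variance component can be checked with a few lines of Python. The mean squares are those of Table I; the variable names are illustrative. The coded mean weight is not given, so the coefficients of variation themselves cannot be recomputed here, but their ratio can, since the mean cancels.

```python
import math

# Mean squares from Table I (coded data).
ms_between = 2013.6   # expected value: sigma^2 + 2 * sigma_R^2
ms_within = 52.4      # expected value: sigma^2
n_dup = 2             # duplicate determinations per rack

# Solve the expected-mean-square equations for the two components.
sigma2 = ms_within
sigma2_rack = (ms_between - ms_within) / n_dup   # 980.6

# Total variance of a random determination in a random rack.
sigma2_total = sigma2 + sigma2_rack              # 1033.0

# Ratio of the coefficient of variation with rack variance included
# to the coefficient of variation with it eliminated.
cv_ratio = math.sqrt(sigma2_total / sigma2)

print(f"rack component    = {sigma2_rack:.1f}")
print(f"total variance    = {sigma2_total:.1f}")
print(f"CV ratio          = {cv_ratio:.2f}")
print(f"quoted 11.4 / 2.6 = {11.4 / 2.6:.2f}")
```

The computed ratio of about 4.4 agrees, to within the rounding of the quoted percentages, with the ratio of the two coefficients of variation given in the text.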

The  data  also  permitted  an  analysis  to  be  made  of  error  contributions 
by  the  individual  racks.  Noting  that  the  error  contribution  from  any 
rack  will  appear  as  a  bias  for  all  test  tubes  within  that  rack,  the 
rack  bias  may  be  estimated  by  examination  of  the  mean  weights  of  the 
40 aliquots in each rack. Table II shows some interesting data.

One way to eliminate the error contributions from a rack is to
establish a correction factor, as a function of the bias, for each rack.
A second and immediately applicable method would consist of isolating
the most heavily biased racks, such as those starred in Table II, and
either retiring them from service or making some physical adjustment
to eliminate the bias. The actual outcome was an even better corrective
measure. When the laboratory management was made cognizant of the
facts, it obtained a specially made rack, and required that all tubes
be drawn down to volume only in that rack, although this
necessitated one extra handling step.

RACK BIASES

Rack        Deviation from Mean
Number      Weight for all Racks

   1             +  4.9
   2             +  1.9
   3             + 20.4*
   4             -  2.6
   5             +  2.9
   6             + 14.4*
   7             - 85.6*
   8             + 31.4*
   9             +  4.4
  10             +  7.9

*A Significant Deviation

TABLE II
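The rack deviations of Table II can be tied back to the analysis of variance of Table I. The sketch below rests on two assumptions: that the tabulated deviations are in the same coded units as Table I and refer to the means of the duplicate determinations, and that the deviation for rack 4, whose sign is illegible in the source, is negative (deviations from a common mean must sum to zero, which requires -2.6). Under these assumptions the between-racks sum of squares is twice the sum of the squared deviations.

```python
# Deviations of rack means from the mean for all racks (Table II).
# The sign of the rack 4 entry is an assumption; see the note above.
dev = [4.9, 1.9, 20.4, -2.6, 2.9, 14.4, -85.6, 31.4, 4.4, 7.9]

n_dup = 2                      # duplicate determinations per rack
n_racks = len(dev)

# Deviations from the grand mean should sum to zero.
dev_sum = sum(dev)

# Between-racks sum of squares and mean square.
ss_between = n_dup * sum(d * d for d in dev)
ms_between = ss_between / (n_racks - 1)

print(f"sum of deviations = {dev_sum:.6f}")
print(f"between-racks SS  = {ss_between:.1f}   (Table I: 18,122.8)")
print(f"between-racks MS  = {ms_between:.1f}   (Table I: 2013.6)")
```

Rack 7 alone accounts for roughly four fifths of the between-racks sum of squares, which is consistent with its being singled out for corrective action.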


2.2 Estimation of Test Tube Variance.

For the estimation of test tube contribution to the variance of
aliquot  weights,  forty  test  tubes  were  randomly  selected,  tared,  and 
placed  in  a  particular  rack.  These  test  tubes  were  then  filled,  drawn 
down  to  volume,  and  weighed  individually.  Correction  for  the  tare 
thus  yielded  an  estimate  of  the  volumes  of  40  randomly  chosen  test 
tubes.  The  same  test  tubes  were  then  replaced  in  the  same  rack, 
refilled,  and  the  process  repeated.  Thus,  a  pair  of  determinations 
was obtained for each of the forty test tubes in the rack. The
measure  of  variance  provided  by  variation  among  the  means  of  the 
forty  pairs  includes  test  tube  variability  and  lack  of  reproducibility 
of  the  suction  device,  but  does  not  include  variance  introduced  by 
racks since only one rack was employed. On the other hand, differences
between duplicate determinations made on the same test tube
provide an estimate of the variance attributable only to lack of
reproducibility of the suction process.

An  analysis  of  variance  similar  to  that  discussed  above  was 
performed  to  investigate  test  tube  effects.  The  between  test  tubes 
mean  square  was  significant  compared  to  the  within  test  tubes  mean 
square.  Now,  the  within  racks  variance  found  earlier  is  another 
independent  estimate  of  the  between  tubes  variance.  That  the  two 
such  estimates  are  in  excellent  agreement  may  be  seen  from  Table  III, 
which  shows  the  pertinent  coefficients  of  variation.  In  addition,  the 
measure  reflecting  the  degree  of  reproducibility  of  the  suction 
process is included, as is the rack to rack measure.

COEFFICIENTS OF VARIATION

Based on Rack to Rack            11.4%
Based on Within Racks             2.6%
Based on Between Test Tubes       2.4%
Based on Within Test Tubes        0.4%

TABLE III

Table  III  affords  a  summary  of  the  two  experimental  studies. 
Obviously, the rack to rack variation is the largest. As has been
noted above, however, appropriate corrective action was taken to
eliminate  this  effect.  The  fact  that  the  two  independent  estimates 
of  the  tube  to  tube  variation  are  so  close  to  each  other  clearly 
establishes  this  as  a  real  source  of  variation  in  volumes.  Further 
investigation  revealed  differences  in  tube  diameters  and  bottom 
curvatures.  These  are  standard  laboratory  supplies,  however,  and 
could not be ordinarily obtained at better quality levels. But, a
policy was established to request that new orders be filled from one
production lot whenever possible, and some inspection procedure was to
be set up to examine receipts of new tubes. The data also reveal that,
since the within test tubes coefficient of variation is so small, and
it  reflects  the  filling  reproducibility  of  the  suction  device,  there 
is  probably  no  problem  on  that  account. 

3. STUDY OF COLORIMETERS AND ANALYSTS. In the study of any
operation, for the purpose of enhancing its effectiveness, all sensitive
phases must be considered. In the preceding section, the drawing down
to volume step was examined, and placed under a better state of control.
This section will be directed at consideration of the equipment and
operators.

As  noted  earlier,  there  are  two  instruments  of  each  of  two  types, 
and  four  analysts.  Thus,  an  analysis  or  group  of  analyses  may  be 
performed  on  any  one  of  four  instruments.  Unless  all  four  possess 
the same intrinsic properties of variation, color perception, translation
of color perception to signal output, etc., each machine represents
a  different  analytical  system,  and  hence,  as  its  output,  produces  results 
which  may  not  be  completely  comparable  to  the  outputs  of  the  others. 

That is, a set of results may be high or low, more dispersed or less,
as a consequence of the particular instrument which generated it. Such
a situation dilutes the effort of a laboratory, since, on the one hand,
real differences between batches may be masked by the analytical system,
or, at the other extreme, minor differences might be exaggerated by the
system. 

Much  the  same  might  be  said  with  regard  to  the  several  analysts 
who share the bench work responsibilities.

Because of these possibilities, the four instruments and the four
analysts were studied through a 3-factor experiment designed to yield
information on accuracy and precision of each instrument and analyst.
The third factor was concentration level, since this also was suspected
of contributing to bias and loss of precision. The data output was
combined with data obtained earlier in another connection.

3.1  Drift  and  Bias. 


Earlier  in  the  study  program,  the  presence  of  instrument  drift  was 
suspected,  and  reaffirmed  in  subsequent  data  analyses.  Accordingly,  one 
output  of  the  experiment  was  a  set  of  data  deliberately  designed  to  pro¬ 
vide  evidence  pertaining  to  instrument  drift.  This  was  accomplished  by 
comparing results from each of three concentrations which were used.
Comparisons were made between first and second members of pairs of consecutive
samples of the same concentration, and between early in the run and
late in the run analyses.

The  first  comparison  type  revealed  clearly  that  there  is  a  carry¬ 
over  effect  in  the  automatic  analyzer.  Rather  conclusive  evidence 
showed  that  if  two  samples  of  the  same  concentration  followed  a  higher 
concentration,  the  first  of  the  two  showed  a  higher  concentration  than 
the  second.  If  two  samples  of  the  same  concentration  followed  a  lower 
concentration,  the  first  of  the  two  showed  a  lower  concentration  than 
the  second.  And,  when  two  samples  of  the  same  concentration  were  first 
in  a  series  of  unknowns,  the  first  was  lower  than  the  second.  Thus, 
it  may  be  concluded  that  there  is  indeed  a  carry-over  effect. 

The same study showed that the manual instruments did not exhibit
a significant drift, except in one instance of an apparent interaction
between one of the analysts and one instrument. Since the samples
presented to the manual device are each in its own test tube, this is
further confirmation of the possibility that the common test cell in
the automatic is not sufficiently purged before entry of the next
sample. As still further evidence to support this thesis, it should
be noted that all four analysts' work showed the upward drift in the
calibration groups on automatic No. 2, and all but one did so on
automatic No. 1. On manual No. 1, all four analysts had little or no
drift indications, while on manual No. 2 three of the four did so. It
may also be noteworthy that one analyst was the exception in each of
the two cases cited. The difference between his performance and that
of the others will also be evident below.

Analysis  for  instrument  bias  revealed  that  the  two  manuals  and 
one  of  the  automatics  had  positive  bias,  while  the  other  automatic 
had a negative bias. If only one analyst had made all determinations,
on only one concentration, the above conclusion would be relatively
firm. However, since several concentrations were used, interactions
may have influenced the results. That is, the amount and direction
of bias on each instrument may vary with the concentrations, or with
the analyst. The instrument-concentration interaction is shown in
Table IV.


AVERAGE BIAS BY INSTRUMENT-CONCENTRATION COMBINATIONS

                                   Concentration Level
Instrument                       850       2540       5070

Automatic 1                     + 48      + 100      -  51
Automatic 2                     + 48      +  15      - 114
Manual 1                        + 94      +  48      -  18
Manual 2                        + 85      +  62      +  20
All                             + 69      +  56      -  27

Bias as % of Concentration        8%         2%       - 5%

TABLE IV

It  can  readily  be  seen  that,  with  but  one  exception,  bias  is  an 
inverse  function  of  level,  so  that  the  low  concentration  has  the  highest 
bias  and  the  high  concentration  has  either  a  negative  bias  or  the  least 
positive  bias  for  a  given  instrument.  Stating  it  differently,  lower 
concentrations tend to be measured higher than actual levels, while
higher  concentrations  tend  to  measure  lower  than  actual  values.  However, 
the  order  of  magnitude  of  the  bias  does  not  appear  to  be  any  clear 
function  of  the  particular  instruments. 

The  ultimate  breakdown  of  bias  is  according  to  instrument, 
concentration,  and  analyst.  Since  this  three  factor  interaction  was 
also statistically significant, further insight can be gained by
examination of the three factor bias components, as in Table V.




AVERAGE BIAS BY ANALYST-INSTRUMENT-CONCENTRATION COMBINATIONS

                                      Instrument
Concentration   Analyst      M1       M2       A1       A2

     850           1         123      167       97      103
                   2          70       70       37       17
                   3         130       75       25       22
                   4          30       60       40       30

    2540           1         140      113      153      117
                   2          73       60      107       93
                   3        - 07     - 07      170      120
                   4        - 13       80     - 30     - 83

    5070           1          57     - 27     -133       70
                   2        - 20       93       93     -293
                   3        -155     - 11       82     - 45
                   4          83      117     - 90     -157

TABLE V

A quick confirmation of the concentration effect noted above may be
obtained from the first appearance of the negative biases in the middle
concentration, and the greater number of negatives in the high
concentration.

More importantly, however, Table V provides an entrée for drawing
inferences on analyst bias. For example, there are 12
instrument-concentration combinations. In 9 of these, analyst 1 has an extreme amount
of  bias,  and  in  2  others  is  close  to  an  extreme.  Thus,  he  is  outlying, 
or  out  of  line,  in  11  of  12  possible  cases.  Similarly,  analyst  3  is  out 
of  line  in  7  cases,  followed  by  analyst  4  who  is  out  of  line  5  times, 
and  analyst  2,  with  3  times.  Further,  from  the  original  data  analyst  1 
has  an  over-all  bias  of  +82,  analyst  3  has  +25,  analyst  2  has  +18,  and 
analyst  4,  +06.  Analyst  1  is  the  same  one  who  was  the  exception  to  the 
general  pattern  of  drift  shown  by  the  analysts,  as  discussed  earlier. 

While analyst 1 is almost consistently high, having the largest
positive bias on both manuals and automatic 2, the others were quite
inconsistent. Analyst 2 has the largest negative bias, underanalyzing
on automatic 2, although his biases on the other three instruments are
positive; analyst 3 has negative bias on both manuals, and positive on
both automatics. Analyst 4 shows a bias pattern just opposite to that
of analyst 3.

3.2 Precision.

A similarly detailed analysis was made of precision. In the
interests of brevity only the findings are presented herein. Analyst
1 stood out again, this time for having the largest variances. The
other three analysts showed some inconsistencies, but not approaching
the degree indicated by No. 1. In terms of the instruments, the two
manual instruments had higher variances than the two automatics, but
not significantly so. An important finding, and one that corroborated
early suspicions, was that variance increased with concentration levels.
Indeed, a good linear fit was obtained by taking the logarithm of the
variance as a function of the logarithm of the concentration level.
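
The log-log fit mentioned above can be sketched as follows. This is only an
illustration of the technique: the function name fit_log_log and the three
variance figures are hypothetical, since the paper does not report the
fitted data, and base-10 logarithms are an assumption.

```python
import math

def fit_log_log(levels, variances):
    """Least-squares fit of log10(variance) = intercept + slope * log10(level)."""
    xs = [math.log10(c) for c in levels]
    ys = [math.log10(v) for v in variances]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Concentration levels used in the study; the variances are illustrative
# placeholders only and are NOT taken from the paper.
levels = [850.0, 2540.0, 5070.0]
variances = [400.0, 1600.0, 3600.0]
intercept, slope = fit_log_log(levels, variances)
```

A positive slope corresponds to variance increasing with concentration; a
slope near 2 would indicate a roughly constant coefficient of variation.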

4. STABILITY OF STANDARDS. By now, the overall program has
indicated the need for control over supplies, instrument operating
characteristics, and efficiency of the individual analysts. One
important phase of the operation which had not yet been examined was
the stability of standard solutions used for obtaining calibration
curves and checks on machine drift.

Happily, a large test run was in the offing, and we were able to
design an experiment using several standard solution concentrations.
Samples were inserted as blind samples in each rack of 40 tubes of
ostensible production samples. The quantity of samples, coupled with
no overtime allowed, dictated that three working days be required to
perform the chemical analyses. Since by now the laboratory management
was convinced of the upward drift trait of the automatic colorimeters,
they decided to use only the manual instruments for this particular
job. Hence, one variable was eliminated from the areas of concern.

The results of the analysis were again in part confirmatory of
other findings, and in part substantively directed at support of an
important hypothesis. Once more, for the third or fourth independent
case, variance was found to be related to concentration level. But
more importantly, the data showed that over a three day period,
concentrations were not stable; definite and significant losses in levels
were determined on the second day as compared to the first day, and
the third day compared to the second. This effect was present
independently of instrument or analyst. In the course of any one day,
there was some hint of deterioration from start to end, but it was not
sufficiently clear cut and persistent to permit a positive assertion.

5. CONCLUSION. I would like to conclude this paper by summarizing
the various applications of statistical methods which we have found
useful in laboratory control investigations, and the kinds of answers
that were obtained. And, since this audience has its primary interests
in the applications of statistical quality control to laboratory problem
areas, we may note the implied laboratory controls which were recommended
to management in this particular case.


5.1 Summary.

It  is  obvious  from  my  remarks  in  Sections  2-4  that  designed 
experiments  and  analysis  of  variance  are  well  suited  to  the  study 
of  laboratory  operations  for  the  purpose  of  pinpointing  problem  areas. 
Less  obvious  are  several  other  techniques  which  were  used  to  good 
advantage  in  this  particular  program. 

Significance tests on means and variances helped to evaluate the
merits of one analyst versus another, or one instrument versus another.
Nonparametric tests for trend assisted in the investigation of
instrument drift and stability of solutions. Regression and correlation
analysis were also used in the study of the volumes and concentration
level. In an ancillary part of the overall investigation, response
surface analysis was also used to good advantage.

On  the  strength  of  the  findings  resulting  from  the  applications 
of  these  techniques,  it  was  possible  to  confirm  many  conjectures  which 
had  previously  existed,  as  well  as  absolve  of  responsibility  for  bias  and 
variation  one  or  two  aspects  of  the  operation.  On  net  balance,  many 
recommendations  were  tendered  the  management,  including  those  discussed 
in the next section.

5.2 Statistical Control Recommendations.

A  laboratory  control  program  administered  by  a  suitably  trained 
individual  would  be  highly  desirable.  It  would  serve  the  purposes  of 
keeping  management  informed,  pointing  out  where  corrective  action  is 
required,  and  helping  the  analysts  to  do  their  best. 

As a minimum, the following elements should comprise the control
effort:


1.  Control  charts  for  each  colorimetric  instrument,  to  maintain 
surveillance  over  bias  and  precision.  A  multi-vary  chart  may  be  useful 
here, or a combination of differences and range charts.*

2.  Control  charts  on  each  automatic  colorimeter,  for  drift  control. 
Individuals  and  moving  ranges  charts  may  be  useful  here. 

3. Control charts on selected reagents and other critical solutions,
to avoid using one which has been degraded. Averages and ranges control
charts should be useful here, application being made to reagents obtained
from vendors as well as those prepared in house. In the former case,
the procedures can be related to acceptance sampling.


*Those unfamiliar with multi-vary charts may find explanations in either
reference  given  above. 


4. Control charts on each laboratory analyst, using standards
inserted into the production stream as unknowns. Differences and range
charts are again useful, and should be maintained for each regular
procedure.

Finally, I would like to point out that the findings and recommendations
discussed above are not unique to the particular laboratory concerned.
I have had very similar experiences in working with several establishments
[illegible] different nature. The same kinds of problems were found
[illegible] be helped by an appropriate statistical [illegible]. I'm sure
your laboratory can, too.

Acknowledgment: I am grateful to [illegible] T. M. Adams, a colleague at
bAL, for his permission to draw upon one of his studies for a portion
of the above.


SOME STATISTICAL ANALYSIS WITH RESPECT TO COMPOSITING
IN THE SAMPLING OF BULK MATERIAL

A. J. Duncan, The Johns Hopkins University


1.  INTRODUCTION 


The  sampling  of  bulk  material  differs  in  a  number  of  respects  from 
the sampling of individual items. These differences are discussed at
some length in reference [1]. One difference is that bulk material can
be  physically  composited  whereas  individual  items  cannot.  It  is  thus 
possible  in  the  case  of  bulk  material  to  take  a  physical  average  in 
lieu  of  an  arithmetic  average.  Although  mixing  and  reduction  may  be 
expensive,  the  great  decrease  in  the  number  of  tests  that  have  to  be 
run  with  physical  compositing  is  likely  to  yield  considerable  economy. 

To illustrate bulk sampling with compositing consider the following
example. An inspector wishes to determine the percent nitrogen in
a  given  lot  of  fertilizer.  The  lot  contains  200  bags.  He  selects  20  bags 
at  random  and  with  a  sampling  tube  draws  a  small  portion  of  fertilizer 
from  each  of  the  20  bags.  These  portions  are  poured  on  to  a  rubber 
mat,  are  thoroughly  mixed  and  hand-quartered  until  there  is  just 
enough  to  fill  a  laboratory  bottle.  Two  tests  are  run  on  the  reduced 
composite  sample. 

It is to be noted that the reduction of the composite sample is
a form of subsampling and is thus accompanied by sampling variability.
The variance of this is called the "reduction variance" ($\sigma_r^2$). There
is thus a greater variability with compositing than with arithmetic
averaging. In the analysis that follows $\sigma_r^2$ will be one of the terms
in the sampling variance whenever there is compositing. Generally it
is  assumed  that  the  reduction  variance  is  the  same  whether  we  are 
reducing  a  large  or  somewhat  smaller  quantity.  Of  course,  if  the 
composite  sample  that  is  being  reduced  is  not  large  relative  to  the 
part  retained,  a  finite  population  correction  factor  may  have  to  be 
applied. 

Before  we  discuss  the  statistical  aspects  of  compositing,  let  us 
look  at  the  statistical  procedures  pertinent  to  a  case  in  which  there 
is  no  compositing. 

2. SAMPLING WITH NO COMPOSITING

Let us consider a modification of the ASTM Tentative Recommended
Practice for Sampling Industrial Chemicals (E 300-66 T). Let us
consider only a single stage instead of the two-stage plan that is
actually discussed in the Recommended Practice. To keep a concrete
example in mind suppose the bulk material comes in cans and is
homogeneous within cans, but varies in quality from can to can. Assume
that we have an isolated lot of this material and are interested in
the mean of the lot. With our modification of E 300-66 T the procedure
would go as follows:


1. Take a preliminary sample of $n_1$ (e.g. 10) cans and measure
the quality characteristic of the contents of each can.

2. Compute $\bar{X}_1 = \sum X_i/n_1$ and $s_1^2 = \sum (X_i - \bar{X}_1)^2/(n_1 - 1)$.

3. Use these data to determine an overall sample size ($n$) that
would yield certain desired criteria.

4. Take an additional $n - n_1$ cans and test the contents of each.

5. Compute $\bar{X} = \sum X_i/n$ and $s^2 = \sum (X_i - \bar{X})^2/(n - 1)$.

6. Determine 0.95 confidence limits for the mean of the lot.

Thus 0.95 confidence limits for $\mu$ would be $\bar{X} \pm t_{.025}\, s/\sqrt{n}$ where $t_{.025}$
is the 0.025 point of a t distribution with $n - 1$ degrees of freedom.
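
Steps 5 and 6 above can be sketched in a few lines. The can measurements
and the function name t_confidence_limits are hypothetical; the critical
value 2.262 is the tabulated upper 0.025 point of t with 9 degrees of
freedom and would have to be replaced for any other sample size.

```python
import math
import statistics

def t_confidence_limits(values, t_crit):
    """Two-sided limits X-bar +/- t_crit * s / sqrt(n) for the lot mean."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)          # divisor n - 1, as in step 5
    half_width = t_crit * s / math.sqrt(n)
    return mean - half_width, mean + half_width

# Hypothetical quality measurements on n = 10 cans (not from the paper).
cans = [20.1, 19.8, 20.4, 20.0, 19.7, 20.3, 20.2, 19.9, 20.1, 20.5]
T_025_9DF = 2.262                          # tabulated t value for 9 d.f.
lower, upper = t_confidence_limits(cans, T_025_9DF)
```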


3.  SAMPLING  WITH  COMPOSITING 

Now consider the above example if after the $n$ cans are selected
they are physically composited and a single test made on this composite.
Assume that the $n_1$ preliminary cans are measured as before. By this
compositing we have reduced the cost of inspection by the cost of
$n - n_1 - 1$ tests. We have added, however, the cost of compositing $n$
cans and reducing this for running a single test.


3.1  WHEN  BASIC  VARIANCES  ARE  KNOWN 


The variance of the single composite measurement ($X_c$) will be

$$\sigma_{X_c}^2 = \frac{\sigma_x^2}{n} + \sigma_r^2 + \sigma_a^2 \qquad (1)$$

where $\sigma_x^2$ is the product variance, $\sigma_r^2$ is the variance of reduction and
$\sigma_a^2$ is the variance of analysis. If we knew all three of the variances,
0.95 confidence limits for $\mu$ would be given by

$$X_c \pm 1.96\left[\frac{\sigma_x^2}{n} + \sigma_r^2 + \sigma_a^2\right]^{1/2}$$
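
As a small numerical illustration of the known-variance case, the sketch
below evaluates the variance of a composite measurement as the product
variance divided by n plus the reduction and analytical variances, and then
forms the 1.96-sigma limits. All numerical values and function names are
hypothetical.

```python
import math

def composite_variance(var_x, var_r, var_a, n):
    """Variance of one test on a composite of n cans: var_x/n + var_r + var_a."""
    return var_x / n + var_r + var_a

def known_variance_limits(x_c, var_x, var_r, var_a, n, z=1.96):
    """0.95 limits for the lot mean when all three variances are known."""
    half_width = z * math.sqrt(composite_variance(var_x, var_r, var_a, n))
    return x_c - half_width, x_c + half_width

# Hypothetical values: product, reduction and analytical variances, n = 20.
limits = known_variance_limits(x_c=20.0, var_x=0.09, var_r=0.01, var_a=0.04, n=20)
```

Note that only the product variance is divided by n; the reduction and
analytical variances enter in full because a single reduction and a single
test are made.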


3.2  WHEN  BASIC  VARIANCES  ARE  UNKNOWN 

Let us see what we can do if we do not know the basic variances.
Note that

$$\frac{X_c - \mu}{\left[\frac{\sigma_x^2}{n} + \sigma_r^2 + \sigma_a^2\right]^{1/2}} = \frac{\sqrt{n}\,(X_c - \mu)}{\left[\sigma_x^2 + n\sigma_r^2 + n\sigma_a^2\right]^{1/2}}$$

is normally distributed with zero mean and unit variance. Also note
that $s_1^2$ contains both the product variance ($\sigma_x^2$) and the analytical
variance ($\sigma_a^2$) which we will assume are independent so that

$$E(s_1^2) = \sigma_x^2 + \sigma_a^2$$

Then $\dfrac{(n_1 - 1)s_1^2}{\sigma_x^2 + \sigma_a^2}$ has a $\chi^2$ distribution with $n_1 - 1$ degrees of freedom.

It follows that

$$\frac{\sqrt{n}\,(X_c - \mu)}{(\sigma_x^2 + n\sigma_r^2 + n\sigma_a^2)^{1/2}} \Bigg/ \frac{s_1}{(\sigma_x^2 + \sigma_a^2)^{1/2}}$$

has a t-distribution with $n_1 - 1$ degrees of freedom.

The above statistic would appear to be of little use to us since
we do not know the basic variances. Note, however, that it can be
rewritten in the form:

$$\frac{\sqrt{n}\,(X_c - \mu)}{s_1}\left[\frac{1 + \sigma_a^2/\sigma_x^2}{1 + n\sigma_r^2/\sigma_x^2 + n\sigma_a^2/\sigma_x^2}\right]^{1/2} \qquad (2)$$

Thus if we know the ratios $\sigma_r^2/\sigma_x^2$ and $\sigma_a^2/\sigma_x^2$, the statistic could
be used and good guesses as to the ratios might work out fairly well.
For example, with $n = 20$ variation in the ratios from .8 to 1.2 causes
the factor in brackets to vary only from .208 to .234, so that a 20%
margin of error in estimating the ratios would cause only a deviation
of 0.02 to 0.03 in the value of t, hardly enough to have a significant
effect on the probabilities involved.

If we wish to set up approximate 0.95 confidence limits for $\mu$ in
this case, we would have

$$\text{0.95 confidence limits for } \mu = X_c \pm t_{.025}\,\frac{s_1}{\sqrt{n}}\left[\frac{1 + n\sigma_r^2/\sigma_x^2 + n\sigma_a^2/\sigma_x^2}{1 + \sigma_a^2/\sigma_x^2}\right]^{1/2}$$

3.3  WHEN  OUTSIDE  ESTIMATES  OF  VARIANCE  ARE  AVAILABLE 


If we do not know $\sigma_x^2$, $\sigma_r^2$, $\sigma_a^2$ but have independent estimates,
we can proceed as follows. Let $s_r^2$ be an estimate of the reduction
variance based on $f_r$ degrees of freedom and let $s_a^2$ be an independent
estimate of the analytical variance based on $f_a$ degrees of freedom.
Then we will have

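$$E(s_1^2 - s_a^2) = \sigma_x^2 + \sigma_a^2 - \sigma_a^2 = \sigma_x^2$$

and an estimate of the variance of $X_c$ will be

$$s_{X_c}^2 = \frac{s_1^2 - s_a^2}{n} + s_r^2 + s_a^2 = \frac{s_1^2}{n} + \frac{n-1}{n}\,s_a^2 + s_r^2$$

This is a weighted sum of variances so that following Satterthwaite [4]
its approximate degrees of freedom are

$$\nu = \frac{\left[\dfrac{s_1^2}{n} + \dfrac{n-1}{n}\,s_a^2 + s_r^2\right]^2}{\dfrac{(s_1^2/n)^2}{n_1 - 1} + \dfrac{\left(\frac{n-1}{n}\,s_a^2\right)^2}{f_a} + \dfrac{(s_r^2)^2}{f_r}}$$

Hence the statistic

$$\frac{X_c - \mu}{\left[\dfrac{s_1^2}{n} + \dfrac{n-1}{n}\,s_a^2 + s_r^2\right]^{1/2}}$$

will have approximately a t distribution with $\nu$ degrees of freedom and
0.95 confidence limits for $\mu$ will be given approximately by

$$X_c \pm t_{.025}(\nu)\left[\frac{s_1^2}{n} + \frac{n-1}{n}\,s_a^2 + s_r^2\right]^{1/2} \qquad (3)$$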
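
The Satterthwaite approximation used here can be sketched as below. The
sketch assumes the variance estimate is the weighted sum described in the
text, with weight 1/n on the preliminary-sample variance, (n - 1)/n on the
analytical variance estimate and 1 on the reduction variance estimate; the
function names and the numerical inputs are hypothetical.

```python
def satterthwaite_df(components):
    """Approximate d.f. of sum(w * s2) for (weight, variance, d.f.) triples."""
    total = sum(w * s2 for w, s2, _ in components)
    denom = sum((w * s2) ** 2 / f for w, s2, f in components)
    return total ** 2 / denom

def composite_variance_estimate(s1_sq, sa_sq, sr_sq, n, n1, f_a, f_r):
    """Estimated variance of X_c and its approximate degrees of freedom."""
    components = [
        (1.0 / n, s1_sq, n1 - 1),        # preliminary-sample variance
        ((n - 1.0) / n, sa_sq, f_a),     # outside analytical estimate
        (1.0, sr_sq, f_r),               # outside reduction estimate
    ]
    var_hat = sum(w * s2 for w, s2, _ in components)
    return var_hat, satterthwaite_df(components)

# Hypothetical inputs: n = 20 cans, n1 = 10 preliminary cans.
var_hat, nu = composite_variance_estimate(0.13, 0.04, 0.01, 20, 10, 15, 12)
```

The resulting degrees of freedom are generally fractional; in practice one
would round down before entering a t table.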


4. SAMPLING A STREAM OF LOTS

The  preceding  discussion  was  concerned  with  an  isolated  lot  and 
what  kind  of  inferences  we  could  make  about  the  mean  of  the  lot  under 
varying  circumstances.  The  approach  to  the  analysis  was  strictly 
classical.  Here  we  shall  consider  ASTM  Tentative  Methods  for  Mechanical 
Sampling  of  Coal  (D2234-65T)  and  the  point  of  view  will  be  Bayesian. 

4.1  ASTM  D2234-65T 

Sampling of coal on a conveyor belt consists of taking n increments
of coal systematically from the belt, compositing these increments,
reducing the composite sample to a laboratory sample and making a
determination of quality such as ash content or the like. D2234-65T
offers a solution to the problem of how many increments should make up
the sample. The solution calls for a preliminary determination of the
basic variances and is based on the assumption that the values so
determined continue to be valid for routine sampling of subsequent lots.

4.1.1 THE STOCHASTIC ASSUMPTIONS ABOUT THE STREAM OF COAL

The program offered by this standard is based on a hypothesis
regarding the nature of the coal being sampled. This is that the
variations of quality in the stream consist of two kinds; one is a
local variance, the other a "trend" or "segregation" variance. It
is as if* the coal came in large segments which varied in average
quality from segment to segment, a measure of which is the trend
variance $\sigma_t^2$, while within the segments there is random variability
the variance of which is designated as $\sigma_1^2$ (since it applies to one
lb. increments). The within-segment variance is assumed to be the
same for all segments. Measurements of the quality of individual
increments of $w$ lbs. of coal taken at random from various segments with
no more than one increment per segment would thus have a variance equal
to $\sigma_1^2/w + \sigma_t^2 + \sigma_r^2 + \sigma_a^2$ (where in this case $\sigma_r^2$ refers to the variance
resulting from the reduction of the $w$ lbs. of coal).

*Note that this is considered to be an approximate working model,
not a true model. cf. [5].

4.1.2  THE  PILOT  STUDY 


D2234-65T calls for a pilot study for determining the basic
variance components for a given coal. The study provides for the
collection of 30 sets of two samples from a stopped conveyor belt.
"Each of the 30 sets of samples includes a very small sample, to
furnish data for the random variance, and a large sample, to furnish
data for the system (trend) variance. Since one of the important
components of variance is that due to segregation it is essential
that the 30 sets of samples be so distributed with respect to time
that coverage of all subtypes of coal are represented" [6]. The samples
are to be taken by a two section Belt Divider. "One of the sections
should be approximately the width corresponding to three times the
top size of the coal and should trap a sample of between 4 and 20
lb. The other section should be approximately the width corresponding
to 20 times the top size of the coal and should trap a sample between
80 and 150 lb." [6]. Designate the small samples by the letter A
and the large samples by the letter B. The subsamples A are reduced,
say by a riffle, to laboratory samples of between 100 and 200 grams.
These are ground to -60 mesh for analysis. The subsamples B are also
worked down to laboratory samples and ground to -60 mesh for analysis.

The variances of the A and B results are measured by the usual
formula $s^2 = \sum (X - \bar{X})^2/(n - 1)$.


Now if the weight of the A samples is $w_1$ and that of the B samples
is $w_2$, then assuming $w_2/w_1$ to be an integer and letting $\sigma_{w_1}^2$ be
the random variance for increments of coal weighing $w_1$ lbs., we have

$$\text{Expected value of } s_A^2 = \sigma_t^2 + \sigma_{w_1}^2 + \sigma_r^2 + \sigma_a^2$$

and

$$\text{Expected value of } s_B^2 = \sigma_t^2 + \frac{w_1}{w_2}\,\sigma_{w_1}^2 + \sigma_r^2 + \sigma_a^2$$

An unbiased estimate of $\sigma_{w_1}^2$ will be given by

$$s_{w_1}^2 = \frac{w_2\,(s_A^2 - s_B^2)}{w_2 - w_1}$$

and for 1 lb. increments, the random variance can be estimated as

$$s_1^2 = w_1\, s_{w_1}^2$$

To estimate the trend variance $\sigma_t^2$, multiply $s_A^2$ by $w_1$ and $s_B^2$ by $w_2$
and subtract. This yields

$$s_t^2 = \frac{w_2\, s_B^2 - w_1\, s_A^2}{w_2 - w_1} - s_r^2 - s_a^2$$

where $s_r^2$ and $s_a^2$ are estimates of the reduction and analytical variances
obtained from another pilot study that need not be reviewed here. The
above estimates can be used to obtain an estimate of the variance of
an increment of any weight $w$. Thus

$$s_w^2 = \frac{s_1^2}{w} + s_t^2 + s_r^2 + s_a^2$$

A composite of $n$ increments would have an estimated variance equal to

$$\frac{\dfrac{s_1^2}{w} + s_t^2}{n} + s_r^2 + s_a^2 \qquad (4)$$
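
The pilot-study arithmetic can be sketched as follows. The sketch assumes
the estimators implied by the expectations of the A and B sample variances:
the random variance is obtained from their difference, and the trend
variance from the weighted difference less the separately estimated
reduction and analytical variances. The function names and the numbers in
the example are hypothetical.

```python
def pilot_variance_components(s_a_sq, s_b_sq, w1, w2, s_r_sq, s_anal_sq):
    """Return (random variance per 1 lb. increment, trend variance).

    s_a_sq, s_b_sq : variances of the small (w1 lb.) and large (w2 lb.) samples
    s_r_sq, s_anal_sq : outside estimates of reduction and analytical variance
    """
    s_w1_sq = w2 * (s_a_sq - s_b_sq) / (w2 - w1)   # random variance at w1 lbs.
    s_1_sq = w1 * s_w1_sq                          # scaled to 1 lb. increments
    s_t_sq = (w2 * s_b_sq - w1 * s_a_sq) / (w2 - w1) - s_r_sq - s_anal_sq
    return s_1_sq, s_t_sq

def composite_variance_eq4(s_1_sq, s_t_sq, s_r_sq, s_anal_sq, w, n):
    """Estimated variance of a composite of n increments of w lbs. each."""
    return (s_1_sq / w + s_t_sq) / n + s_r_sq + s_anal_sq

# Hypothetical pilot-study figures: 10 lb. and 100 lb. samples.
s_1_sq, s_t_sq = pilot_variance_components(1.0, 0.82, 10.0, 100.0, 0.1, 0.2)
```

With these illustrative inputs the components come back as 2 and 0.5, the
values from which the two sample variances were constructed.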


4.1.3  DETERMINATION  OF  n  FOR  SUBSEQUENT  SAMPLING 


The  last  formula  is  employed  by  the  ASTM  standard  to  determine 
how  many  increments  should  be  used  in  future  routine  sampling  of  a  lot. 
Thus, proceeding in a crude manner, the standard notes that 0.95
confidence limits for the mean of any given portion of coal from which $n$
increments of weight $w$ have been taken, composited, reduced and tested
would be given roughly by

$$X_c \pm 1.96\left[\frac{\dfrac{s_1^2}{w} + s_t^2}{n} + s_r^2 + s_a^2\right]^{1/2}$$

so if we wish the confidence interval to be of width $2\Delta$, then we would
take

$$\Delta = 1.96\left[\frac{\dfrac{s_1^2}{w} + s_t^2}{n} + s_r^2 + s_a^2\right]^{1/2}$$

and solve for $n$. The standard takes $\Delta = .10\,\mu'$ where $\mu'$ is a good guess
as to the mean quality of the consignment of coal.
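
Solving the half-width equation for n can be sketched as below. The
function name and the numerical inputs are hypothetical; the sketch simply
rearranges the relation "half-width equals 1.96 times the square root of
the composite variance" and rounds n up to the next whole increment. No
value of n can meet the target when the reduction and analytical variances
alone exceed it, so that case raises an error.

```python
import math

def increments_needed(s_1_sq, s_t_sq, s_r_sq, s_a_sq, w, delta, z=1.96):
    """Smallest n for which the rough 0.95 half-width does not exceed delta."""
    room = (delta / z) ** 2 - s_r_sq - s_a_sq
    if room <= 0:
        raise ValueError("reduction and analysis variance alone exceed the target")
    return math.ceil((s_1_sq / w + s_t_sq) / room)

# Hypothetical variance components and a target half-width of 1.2 units.
n_required = increments_needed(2.0, 0.5, 0.1, 0.2, w=10.0, delta=1.2)
```

Increasing n reduces only the increment term; beyond some point further
increments buy little, because the reduction and analytical variances set
a floor on the achievable half-width.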


4.1.4  ESTIMATION  OF  THE  MEAN  OF  THE  CURRENT  LOT 


With the number of increments determined as described in 4.1.3,
the mean of a current lot is estimated by taking the prescribed number
of increments from the current lot, compositing them, reducing the
composite to a laboratory sample and analyzing a specimen from this
sample. The result $X_c$ is taken as an estimate of the quality of the
lot. From the way in which the number of increments was decided this
estimate is expected to be within 10% of the true mean of the lot.
If confidence limits are desired, they could be crudely determined by

$$\mu = X_c \pm 1.96\left[\frac{\dfrac{s_1^2}{w} + s_t^2}{n} + s_r^2 + s_a^2\right]^{1/2}$$
4.2 USE OF PRIOR INFORMATION IN ESTIMATING THE MEAN OF A CURRENT LOT


It will be noted that while the prior information on variances is
used in Section 4.1.4 to set up crude confidence limits for the lot mean,
no use is made of the mean of the pilot study. The question may be
raised, however, as to whether the mean of the current lot would not
be better estimated by a weighted average of the pilot study mean and
the measurement $X_c$ made from the current lot. The argument would run
like this. If the various lots of coal were really large samples from
the stream of coal, their individual means would probably differ very
little from the mean of the whole stream and the best estimate we
could make of the mean of a current lot would be an estimate of the
mean of the stream based on all the information available for making
such an estimate. Suppose, for example, that in the pilot study the
mean of the 30 large ($w_2$ lb.) samples was $\bar{X}_{w_2}$ and the mean of the 30
small ($w_1$ lb.) samples was $\bar{X}_{w_1}$; then an estimate of the mean of a
subsequent lot on which we have a composite sample measurement $X_c$
could be taken as

$$\hat{X}_1 = \frac{30w_2\bar{X}_{w_2} + 30w_1\bar{X}_{w_1} + nwX_c}{30w_2 + 30w_1 + nw}$$

where $w$ is the weight of the $n$ increments composited in the sampling
of the current lot. An alternative estimate that omits the small
samples of the pilot study would be

$$\hat{X}_2 = \frac{30w_2\bar{X}_{w_2} + nwX_c}{30w_2 + nw}$$

This would be based on less data. The reduction in the amount of data
would not be great, however, and as discussed below, it might be feasible
to use $\hat{X}_2$ in a supplementary test of significance but not $\hat{X}_1$.
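
Both pooled estimates are weight-proportional averages and can be computed
by one small routine. The function name and the numerical values are
hypothetical; the count of 30 pilot sets follows the text.

```python
def pooled_estimate(pilot_means, pilot_weights, x_c, n, w, n_sets=30):
    """Average of pilot-study means and the current composite result,
    each weighted by the total weight of coal it represents."""
    numerator = n * w * x_c
    denominator = n * w
    for mean, weight in zip(pilot_means, pilot_weights):
        numerator += n_sets * weight * mean
        denominator += n_sets * weight
    return numerator / denominator

# Hypothetical ash percentages: large-sample mean 10.0 (100 lb. samples),
# small-sample mean 10.4 (10 lb. samples), current composite 12.0 from
# n = 20 increments of 5 lb.
estimate_1 = pooled_estimate([10.0, 10.4], [100.0, 10.0], 12.0, n=20, w=5.0)
estimate_2 = pooled_estimate([10.0], [100.0], 12.0, n=20, w=5.0)
```

Because the pilot study represents far more coal than a single composite,
both estimates stay close to the pilot means; this is the reason the text
goes on to require a significance test before pooling.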


Pooling the pilot study with the current lot measurement should be
preceded by a statistical test to determine whether the assumption that
the two sets of data came from the same population is a reasonably valid
one. Actually it will be sufficient to test whether the means differ
significantly. If we plan to use $\hat{X}_1$ above, the test would be based on
a comparison of $\dfrac{30w_2\bar{X}_{w_2} + 30w_1\bar{X}_{w_1}}{30w_2 + 30w_1}$ with $X_c$ or, if we are to use $\hat{X}_2$, on
a comparison of $\bar{X}_{w_2}$ with $X_c$.


It is necessary at this point to interrupt the argument and to
note that if the sampling and analytical procedures used are as
described in Section 4.1.4 above, there will be only one measurement
made on the current lot, viz., $X_c$, and this will provide us with no
information on the variance of the current lot. In order to be able
to run the suggested significance test, it will be necessary for the
sampling and analytical procedures to be modified to give some
information on variability of the current lot. To accomplish this
it is recommended that in lieu of a single composite sample, 4 separate
composite samples be formed* and measured separately. The mean of the
four separate composite measures ($\bar{X}_c$) would take the place of the
single composite measure $X_c$ and the variance of the 4 separate composite
means would yield an estimate of the variance of the current lot.
Thus, we would have

$$\bar{X}_c = \frac{X_{c1} + X_{c2} + X_{c3} + X_{c4}}{4}$$

and

$$s_c^2 = \frac{\sum_i (X_{ci} - \bar{X}_c)^2}{4 - 1}$$

and [illegible] $s_c^2$ would be an estimate of the variance of increments of weight $w$
from the current lot.


Returning to the discussion of the significance tests, it is to be
noted that if we set $a = 30w_2/(30w_2 + 30w_1)$ and $b = 30w_1/(30w_2 + 30w_1)$,
the variance of the weighted mean of the pilot study data would be

$$\sigma^2_{\text{weighted mean}} = a^2\sigma^2_{\bar{X}_{w_2}} + b^2\sigma^2_{\bar{X}_{w_1}} + 2ab\,r\,\sigma_{\bar{X}_{w_1}}\sigma_{\bar{X}_{w_2}}$$

where $r$ is the correlation between $\bar{X}_{w_1}$ and $\bar{X}_{w_2}$. In practice this would be
estimated by $a^2 s_B^2/30 + b^2 s_A^2/30 + 2ab\,r\,s_B s_A/30$ where

$$r = \frac{\sum_i (X_{iw_2} - \bar{X}_{w_2})(X_{iw_1} - \bar{X}_{w_1})}{29\, s_B s_A}$$

and $s_B^2$ and $s_A^2$ are as defined in Section 4.1.2 above. The variance of $\bar{X}_c$
would be estimated by $s_c^2/4$. A crude test of the difference between the
two means would therefore be given by treating

$$\frac{\dfrac{30w_2\bar{X}_{w_2} + 30w_1\bar{X}_{w_1}}{30w_2 + 30w_1} - \bar{X}_c}{\left[\dfrac{a^2 s_B^2 + b^2 s_A^2 + 2ab\,r\,s_B s_A}{30} + \dfrac{s_c^2}{4}\right]^{1/2}} \qquad (5)$$

as if it were normally distributed. If this does not fall beyond +1.96 or
-1.96, we could conclude that it is safe to make the pooled estimate. If
the given statistic falls beyond ±1.96, pooling is not recommended and $\bar{X}_c$
alone should be taken to estimate the mean of the current lot.

*If the increments are taken systematically from the lot then
increments 1, 5, 9, ... could be mixed to form composite 1, increments 2, 6, 10, ...
could be mixed to form composite 2, increments 3, 7, 11, ... could be mixed to
form composite 3, and increments 4, 8, 12, ... could be mixed to form composite 4.


The quantity $r$ would have to be computed from the original pilot
study data. If this information is not available, then

$$\hat{X}_2 = \frac{30w_2\bar{X}_{w_2} + nw\bar{X}_c}{30w_2 + nw}$$

could be used in place of $\hat{X}_1$. In this instance a crude significance test
would be given by treating

$$\frac{\bar{X}_{w_2} - \bar{X}_c}{\left[\dfrac{s_B^2}{30} + \dfrac{s_c^2}{4}\right]^{1/2}} \qquad (5a)$$

as if it were normally distributed, again comparing it with ±1.96.
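
The crude test (5a) can be sketched as follows, assuming its denominator is
the square root of the large-sample pilot variance over 30 plus the
between-composite variance over the number of composites. The function name
and the figures are hypothetical.

```python
import math

def pooling_test_5a(mean_b, s_b_sq, composite_results, n_sets=30):
    """Return (statistic, pool_ok) for the comparison of the pilot-study
    large-sample mean with the mean of the separate composites."""
    k = len(composite_results)                 # 4 composites in the text
    mean_c = sum(composite_results) / k
    s_c_sq = sum((x - mean_c) ** 2 for x in composite_results) / (k - 1)
    statistic = (mean_b - mean_c) / math.sqrt(s_b_sq / n_sets + s_c_sq / k)
    return statistic, abs(statistic) <= 1.96

# Hypothetical results on the four composites of the current lot.
composites = [10.0, 10.2, 9.8, 10.0]
z_same, pool_same = pooling_test_5a(10.0, 0.3, composites)
z_far, pool_far = pooling_test_5a(12.0, 0.3, composites)
```

When the statistic falls within ±1.96 the pooled estimate would be used;
otherwise the mean of the four composites alone is taken, as the text
recommends.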


If there is a series of lots to be inspected from the stream of
coal, the mean of each lot that passes the significance test could be
pooled with the pilot study mean and other past lot means that have met
the significance tests. This pooled mean would then become the point of
reference with which the mean of the current lot ($\bar{X}_c$) would be compared.
The significance test would in this case be carried out by treating

$$\frac{\bar{X}_{\text{Pooled}} - \bar{X}_c}{\left[s^2_{\text{Pooled Mean}} + \dfrac{s_c^2}{4}\right]^{1/2}} \qquad (5b)$$

as if it were normally distributed, where for $m$ past means

$$\bar{X}_{\text{Pooled}} = \frac{30w_2\bar{X}_{w_2} + nw\sum_{i=1}^{m}\bar{X}_{ci}}{30w_2 + mnw}$$

and for $g = \dfrac{30w_2}{30w_2 + mnw}$ and $h = \dfrac{nw}{30w_2 + mnw}$

$$s^2_{\text{Pooled Mean}} = g^2 s_B^2/30 + h^2\sum_{i=1}^{m} s_{ci}^2/4$$

Comparison would again be with ±1.96, although owing to truncation the
α-risk would now be less than 0.05. The basic variances are assumed unchanged.

4.3  A  MORE  GENERAL  BAYESIAN  PROCEDURE 


An all around and somewhat more sophisticated approach as to how to
use the pilot study data in making inferences about the lot mean would be
to use Bayes Theorem in which the prior distributions are based on the
pilot study data. Our probabilities would now become rational degrees of
belief, but they would not be entirely subjective in that if the
mathematical procedure is accepted, the degrees of belief, i.e. the probabilities,
follow directly from the analysis. The procedure will be to note the
various likelihoods and prior distributions and then apply Bayes theorem
to get the posterior distribution for $\mu$, the lot mean.


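We shall begin by taking the composite mean $\bar X_c$ to be distributed
normally about the lot mean $\mu$ with variance equal to

$$\left[\frac{\sigma_i^2/w + \sigma_t^2}{n/4} + \sigma_r^2 + \sigma_a^2\right] \Big/ 4 .$$

If we set $\sigma_X^2 = \sigma_i^2/w + \sigma_t^2$, this becomes

$$\frac{\sigma_X^2}{n} + \frac{\sigma_r^2 + \sigma_a^2}{4} .$$
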
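Then the density function for $\bar X_c$ given $\mu$ is

$$f(\bar X_c \mid \mu) = \frac{\exp\left\{-(\bar X_c - \mu)^2 \Big/ 2\left[\dfrac{\sigma_X^2}{n} + \dfrac{\sigma_r^2 + \sigma_a^2}{4}\right]\right\}}{\sqrt{2\pi}\,\left[\dfrac{\sigma_X^2}{n} + \dfrac{\sigma_r^2 + \sigma_a^2}{4}\right]^{1/2}} \tag{6}$$
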

Next, let the variance of the four composites $X_{ic}$ be
$s_c^2 = \sum (X_{ic} - \bar X_c)^2/(4 - 1)$. The expected value of $s_c^2$ will be

$$E(s_c^2) = \frac{4\sigma_X^2}{n} + \sigma_r^2 + \sigma_a^2$$

and we shall assume that $3s_c^2$ divided by this expected value has a
$\chi^2$ distribution with 3 degrees of freedom. Thus the distribution of
$s_c^2$, given the variances $\sigma_X^2$, $\sigma_r^2$ and $\sigma_a^2$, will be

$$h(s_c^2 \mid \sigma_X^2, \sigma_r^2, \sigma_a^2) = K\,\frac{(s_c^2)^{(3-2)/2}\,\exp\left\{-3s_c^2 \Big/ 2\left[\dfrac{4\sigma_X^2}{n} + \sigma_r^2 + \sigma_a^2\right]\right\}}{\left[\dfrac{4\sigma_X^2}{n} + \sigma_r^2 + \sigma_a^2\right]^{4/2}} \tag{7}$$


where  K  is  a  factor  of  proportionality.  Since  the  two  densities  are 
independent  of  each  other  their  joint  density  will  be  the  product  of 
the  two  individual  densities. 

We  shall  express  our  degrees  of  belief  about  prior  distributions 
as  follows: 


Prior distribution for $\mu$ is proportional to

$$\frac{1}{\sigma_X}\, e^{-M(\mu - \mu_0)^2/2\sigma_X^2} \tag{8}$$

where $M$ is the size of the lot and $\mu_0$ is the grand mean of the pilot study.

The  assumption  here  is  that  the  lots  are  merely  large  samples  from  the 
stream  of  coal. 


Prior distribution for $\sigma_X^2$ is proportional to

$$\frac{e^{-f_{oX}(\sigma_{oX}^2)/2\sigma_X^2}\,(\sigma_{oX}^2)^{f_{oX}/2}}{(\sigma_X^2)^{1 + f_{oX}/2}} \tag{9}$$

where $\sigma_{oX}^2$ is the pilot study estimate of $\dfrac{\sigma_i^2}{w} + \sigma_t^2$
and $f_{oX}$ is the degrees of freedom on which it is based. This is a
conjugate prior for the distribution of $s_c^2$.

Prior distribution for $\sigma_r^2$ is proportional to

$$\frac{e^{-f_{or}(\sigma_{or}^2)/2\sigma_r^2}\,(\sigma_{or}^2)^{f_{or}/2}}{(\sigma_r^2)^{1 + f_{or}/2}} \tag{10}$$

Prior distribution for $\sigma_a^2$ is proportional to

$$\frac{e^{-f_{oa}(\sigma_{oa}^2)/2\sigma_a^2}\,(\sigma_{oa}^2)^{f_{oa}/2}}{(\sigma_a^2)^{1 + f_{oa}/2}} \tag{11}$$

These are similar in form to the density assumed for $\sigma_X^2$.

The product of all the above densities would be the joint distribution
of the likelihoods and priors. Unfortunately the expression is too
complicated to handle analytically. It is possible to do something, however,
if we neglect $\sigma_r^2$ and $\sigma_a^2$ and merely include $\mu$ and
$\sigma_X^2$. Since $\sigma_r^2$ and $\sigma_a^2$ are likely to be small
relative to $\sigma_X^2$, the approximation may not be bad.
Confidence limits based on it will indicate limits that are less than
the true ones. They will offer a lower bound, however.

Proceeding as indicated, we would have, omitting non-relevant factors,
the following joint distribution of likelihoods and priors. (Note that
$f_o$ and $\sigma_o$ are now used instead of $f_{oX}$ and $\sigma_{oX}$ since
there is no longer a need to distinguish pilot study variances.) Thus
$f(\bar X_c, s_c^2, \mu, \sigma_X^2)$ would be proportional to

$$\frac{e^{-n(\bar X_c - \mu)^2/2\sigma_X^2}}{\sigma_X/\sqrt{n}} \times \frac{e^{-M(\mu - \mu_0)^2/2\sigma_X^2}}{\sigma_X} \times \frac{(s_c^2)^{1/2}\, e^{-3s_c^2/2(4\sigma_X^2/n)}}{(4\sigma_X^2/n)^{4/2}} \times \frac{e^{-f_o(\sigma_o^2)/2\sigma_X^2}\,(\sigma_o^2)^{f_o/2}}{(\sigma_X^2)^{1 + f_o/2}} \tag{12}$$

where $\mu_0$ is the pilot study mean and $\sigma_o^2$ is the pilot study variance.

If we integrate (12) over $\mu$, this leaves, except for factors not
containing $\sigma_X^2$,

$$\frac{\exp\left\{-\left[-\dfrac{(n\bar X_c + M\mu_0)^2}{n + M} + M\mu_0^2 + n\bar X_c^2 + \dfrac{3ns_c^2}{4} + f_o\sigma_o^2\right] \Big/ 2\sigma_X^2\right\}}{(\sigma_X^2)^{(f_o + 7)/2}} .$$

Set the expression in brackets equal to $H$, so that the quantity becomes,
except for factors not involving $\sigma_X^2$,

$$e^{-H/2\sigma_X^2}\,(\sigma_X^2)^{-(f_o + 7)/2} .$$


The next step is to integrate over $\sigma_X^2$. If we set
$\chi^2 = H/\sigma_X^2$ so that $d\sigma_X^2 = -\dfrac{H}{(\chi^2)^2}\,d\chi^2$,
the integral becomes

$$\int e^{-\chi^2/2}\, H^{-(f_o + 5)/2}\, (\chi^2)^{(f_o + 3)/2}\, d\chi^2$$

which is proportional to

$$H^{-(f_o + 5)/2} .$$

The joint posterior distribution is therefore given (except for a
proportionality factor) by Expression (12) $\times\, H^{(f_o + 5)/2}$.

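To get the marginal distribution for $\mu$, which is the posterior
distribution of $\mu$, we integrate the joint posterior distribution over
$\sigma_X^2$. The part of Expression (12) $\times\, H^{(f_o + 5)/2}$ containing
$\sigma_X^2$ is

$$\frac{\exp\left\{-\left[n(\bar X_c - \mu)^2 + \dfrac{3ns_c^2}{4} + M(\mu - \mu_0)^2 + f_o\sigma_o^2\right] \Big/ 2\sigma_X^2\right\}}{(\sigma_X^2)^{(f_o + 8)/2}} .$$

If we set $G$ equal to the expression in the brackets and put
$\chi^2 = G/\sigma_X^2$, the integral becomes

$$\int e^{-\chi^2/2}\, G^{-(f_o + 6)/2}\, (\chi^2)^{(f_o + 4)/2}\, d\chi^2$$

which is proportional to $G^{-(f_o + 6)/2}$. Accordingly, except for a
proportionality factor, $G^{-(f_o + 6)/2}$ is the posterior distribution of $\mu$.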

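Now $G$ can be put in the form

$$G = (n + M)\left[\left(\mu - \frac{n\bar X_c + M\mu_0}{n + M}\right)^2 + V\right]$$

where

$$V = -\left[\frac{n\bar X_c + M\mu_0}{n + M}\right]^2 + \frac{\dfrac{3ns_c^2}{4} + n\bar X_c^2 + M\mu_0^2 + f_o\sigma_o^2}{n + M} .$$

Dividing by $V$ and absorbing $V^{-(f_o + 6)/2}$ in the proportionality
factor, we have that

$$\left[\frac{\left(\mu - \dfrac{n\bar X_c + M\mu_0}{n + M}\right)^2}{V} + 1\right]^{-\frac{f_o + 5 + 1}{2}}$$

is proportional to

$$\left[\frac{1}{f_o + 5}\left(\frac{\mu - \dfrac{n\bar X_c + M\mu_0}{n + M}}{\sqrt{V/(f_o + 5)}}\right)^2 + 1\right]^{-\frac{f_o + 5 + 1}{2}} .$$

Since the above expression is of the form
$\left(\dfrac{t^2}{\nu} + 1\right)^{-\frac{\nu + 1}{2}}$ it follows
that the posterior distribution of a function of $\mu$ (not $\mu$ itself) has the
form of a t-distribution and that Bayesian confidence intervals for $\mu$ can
be obtained from this.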
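
The completing-the-square step that separates G into a term in the lot mean
and the constant V can be checked numerically. The Python sketch below is
offered only as a sanity check; the function names are hypothetical, and the
particular values of n, M, the means and the variances are illustrative
figures chosen for the check.

```python
import random


def g_direct(mu, n, M, xbar, mu0, s2c, f0, var0):
    """G as the bracketed sum in the exponent of the joint posterior."""
    return n * (xbar - mu) ** 2 + 3 * n * s2c / 4 + M * (mu - mu0) ** 2 + f0 * var0


def v_term(n, M, xbar, mu0, s2c, f0, var0):
    """V, the part of G / (n + M) that does not involve mu."""
    center = (n * xbar + M * mu0) / (n + M)
    rest = (3 * n * s2c / 4 + n * xbar ** 2 + M * mu0 ** 2 + f0 * var0) / (n + M)
    return rest - center ** 2


def g_completed(mu, n, M, xbar, mu0, s2c, f0, var0):
    """G rewritten as (n + M) * [(mu - center)^2 + V]."""
    center = (n * xbar + M * mu0) / (n + M)
    return (n + M) * ((mu - center) ** 2 + v_term(n, M, xbar, mu0, s2c, f0, var0))


# Compare the two forms at many randomly chosen values of the lot mean.
random.seed(0)
worst = 0.0
for _ in range(1000):
    mu = random.uniform(5.0, 15.0)
    a = g_direct(mu, 20, 800, 10.0, 12.0, 0.32, 30, 1.35)
    b = g_completed(mu, 20, 800, 10.0, 12.0, 0.32, 30, 1.35)
    worst = max(worst, abs(a - b) / a)
print("largest relative discrepancy:", worst)
```

Any discrepancy reported should be at the level of floating-point rounding.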

Thus the probability that

$$\frac{\mu - \dfrac{n\bar X_c + M\mu_0}{n + M}}{\sqrt{V/(f_o + 5)}}$$

lies between $-t_{.025}$ and $+t_{.025}$ for $f_o + 5$ degrees of freedom equals 0.95.

Hence the probability is 0.95 that $\mu$ lies between

$$\frac{n\bar X_c + M\mu_0}{n + M} \pm \sqrt{V/(f_o + 5)} \cdot t_{.025} \quad \text{for } f_o + 5 \text{ degrees of freedom} \tag{15}$$

These are consequently the Bayesian 0.95 confidence limits for $\mu$. The limits
given by (15) are tighter than the true ones since it will be recalled that
$\sigma_r^2$ and $\sigma_a^2$ have been neglected.

It is interesting to note that the sample size $n$ appears in $V$ in a
peculiar way. If we divide both the numerators and denominators of the two
terms of $V$ by $n$, we get in the numerators the quantities
$\dfrac{M\mu_0}{n}$, $\dfrac{M\mu_0^2}{n}$, and $\dfrac{f_o\sigma_o^2}{n}$
and in the denominators $M/n$. If we increase $n$, $M/n$ goes down
less rapidly than $\dfrac{M\mu_0}{n}$ or $\dfrac{M\mu_0^2}{n}$ and the
reduction due to $\dfrac{f_o\sigma_o^2}{n}$ is gratis.
This verifies what should be the case, viz., that as $n$ increases the
confidence interval becomes smaller. The same applies to $f_o$, the degrees
of freedom for the pilot study estimate of $\sigma_X^2$. Likewise for the
lot size $M$.

Finally it should be noted that the analysis indicates that the lot
size $M$ should be taken as the weighting factor for $\mu_0$ in getting the
average estimate of $\mu$. This stems from the special assumption that was made
about the prior distribution, viz., that the lots were merely large samples from
the process of size $M$ and thus had a variance of $\sigma_X^2/M$. If the special
assumption about the prior distribution of $\mu$ is that it is normal with
mean $\mu_0$ but with variance $\sigma_X^2/M$ where $M$ is an arbitrary constant,
the model would still hold. In either case the weighting would be inversely
proportional to the variances, as would be expected.

4.4  A  COMPARISON  OF  APPROACHES 


It is of interest to conclude with a comparison of the results yielded
by  the  formula  of  Section  4.1.4  with  Bayesian  confidence  limits  yielded  by 
formula  (15).  The  comparison  will  be  made  by  numerical  examples. 


Suppose that 20 increments of 50 lbs. each are taken from a current
lot to form a composite sample the measurement on which ($\bar X_c$) is 10. Suppose
that the random variance for 1 lb. increments ($\sigma_i^2$) and the trend variance
($\sigma_t^2$) have been estimated from a pilot study to be 7.6 and 1.2 respectively.*
And suppose another pilot study yields an estimate of the sum of the reduction
and analytical variances ($s_r^2 + s_a^2$) as equal to 0.0465.* Then using the
formula of Section 4.1.4 above, approximate 0.95 confidence limits for the
mean of the lot would be

$$\mu = 10 \pm 1.96\left[\frac{\dfrac{7.6}{50} + 1.2}{20} + 0.0465\right]^{1/2} = 10 \pm 0.66$$
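
The arithmetic of this first example can be reproduced directly. The Python
sketch below evaluates the expression as it appears in the numerical example;
the function and argument names are hypothetical, and Section 4.1.4 itself
should be consulted for the formula's derivation.

```python
import math


def lot_mean_half_width(var_i, var_t, w, n, var_ra, z=1.96):
    """Half-width z * sqrt((var_i / w + var_t) / n + var_ra) for one composite."""
    return z * math.sqrt((var_i / w + var_t) / n + var_ra)


# Figures from the example: 20 increments of 50 lbs., variances 7.6, 1.2, 0.0465.
half_width = lot_mean_half_width(var_i=7.6, var_t=1.2, w=50, n=20, var_ra=0.0465)
print(f"10 +/- {half_width:.2f}")  # prints "10 +/- 0.66"
```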

Suppose now that instead of a single composite, 4 composite samples are
taken from the current lot each based on 5 increments of 50 lbs. each so that
$n$ still equals 20. Let the mean of the 4 composite measurements ($\bar X_c$) be 10
and let their variance ($s_c^2$) be 0.32. Let the mean of the pilot study data
($\mu_0$) be 12 and let the size of the lot ($M$), measured in 50 lbs. increments,


*These  figures  are  taken  from  the  illustrative  material  given  in  the 
ASTM standard Methods for Sampling of Coal (D2234).
be 800 such increments. Finally, let the number of degrees of freedom ($f_o$)
on which the pilot study estimate of the product variance $\sigma_o^2$ is based
be 30.* Then with the same values for $\sigma_i^2$, $\sigma_t^2$, $\sigma_r^2$ and
$\sigma_a^2$ as before we will have

$$\sigma_o^2 = \frac{\sigma_i^2}{w} + \sigma_t^2 = \frac{7.6}{50} + 1.2 = 1.35$$

and the 0.95 Bayesian confidence limits for the mean of the current lot
would be (according to formulas (14) and (15))


*The number of degrees of freedom $f_o$ is derived by Satterthwaite's
approximation as follows: The pilot study referred to in the coal sampling
standard D 2234 yielded $s_A^2 = 29.2$ and $s_B^2 = 1.3$ (see Section 4.1.2 of this
paper). In deriving these variances, $w_1 = 0.27$ lb. was taken as the size
of the small samples and $w_2 = 10.6$ lbs. was taken as the size of the large
samples. With $w = 50$, these figures yielded (in accordance with Section 4.1.2
above)

$$\sigma_o^2 = \frac{\sigma_i^2}{50} + \sigma_t^2 = \frac{0.27(1.00255)(s_A^2 - s_B^2)}{50} + 1.00255\,s_B^2 - 0.002554\,s_A^2 - s_{ra}^2$$

where $s_{ra}^2$ is the sum of the reduction and analytical variances. This can
be put in the form

$$\sigma_o^2 = 0.004725\,s_A^2 + 0.9825\,s_B^2 - s_{ra}^2 .$$

Each of the estimates of variance was based on 29 degrees of freedom so that
the degrees of freedom for $\sigma_o^2$ [following Satterthwaite (see A. J. Duncan,
Quality Control and Industrial Statistics, 3rd ed., p. 605)] was

$$f_o = \frac{(0.004725\,s_A^2 + 0.9825\,s_B^2 - s_{ra}^2)^2}{\dfrac{(0.004725\,s_A^2)^2}{29} + \dfrac{(0.9825\,s_B^2)^2}{29} + \dfrac{(s_{ra}^2)^2}{29}}$$

which with $s_A^2 = 29.2$, $s_B^2 = 1.3$ and $s_{ra}^2 = 0.0465$, gives $f_o = 30$.
$$\mu = \frac{20(10) + 800(12)}{20 + 800} \pm t_{.025}(30)\,\sqrt{V/35}$$

where

$$V = -\left[\frac{20(10) + 800(12)}{20 + 800}\right]^2 + \frac{\dfrac{3(20)(0.32)}{4} + 20(10)^2 + 800(12)^2 + 30(1.35)}{20 + 800}$$

$$= -142.80 + \frac{4.8 + 117200 + 40.5}{820} = -142.80 + 142.93 + \frac{45.3}{820} = .13 + .06 = 0.19 .$$

This yields

$$\mu = 11.95 \pm 2.04\sqrt{.0054} = 11.95 \pm 0.15 .$$


As could have been anticipated, the relatively heavy weighting given $\mu_0$ as
compared with that given $\bar X_c$ causes the Bayesian limits to be centered close
to $\mu_0$. Further, the assumption that the lots are merely random samples from the
process yields a prior distribution for the lot mean that has a variance
$\sigma_X^2/M$ which for large $M$ is small. This is what accounts for the much
tighter confidence limits. With large $M$ therefore there will be a marked
difference in the results yielded by the two procedures.


It is not necessary, however, for the validity of the Bayesian model
that the various lots be assumed to be random samples from the process with
a variance equal to $\sigma_X^2$ divided by the lot size. As noted above it is
possible to view $M$ simply as an arbitrary constant which expresses the
assurance we have about the location of the lot mean. Thus, if we take
$M = 80$ instead of 800 as in the previous example, the prior distribution
for $\mu$ will have much greater dispersion which means that our prior
knowledge as to the value of $\mu$ is much less certain. In this case, the
Bayesian confidence limits for the mean of the current lot will be


$$\mu = \frac{20(10) + 80(12)}{100} \pm t_{.025}(30)\,\sqrt{V/35}$$

where

$$V = -\left[\frac{20(10) + 80(12)}{100}\right]^2 + \frac{20(10)^2 + 80(12)^2}{100} + \frac{45.3}{100}$$

$$= -134.56 + 135.20 + .453 = .64 + .453 = 1.093 .$$

This yields

$$\mu = 11.6 \pm 2.04\sqrt{0.0512} = 11.6 \pm .46$$

a result that is much closer to that yielded by the formula of Section 4.1.4
above.


If now we do not wish to assume any prior knowledge regarding the mean
of the current lot (even the mean of the pilot study data is considered
irrelevant), but we are willing to assume a prior distribution for the
variance, then we can modify the Bayesian analysis by putting $M = 0$. If
we do this, our confidence limits for $\mu$ become

$$\mu = \bar X_c \pm 2.04\sqrt{V/35}$$

where

$$V = \frac{3ns_c^2/4 + f_o\sigma_o^2}{n}$$

which for the example in hand becomes

$$V = \frac{45.3}{20} = 2.27 .$$

This yields

$$\mu = 10 \pm 2.04\sqrt{0.0649} = 10 \pm .52 .$$

If we allow for the omission of the variances of reduction and analysis,
this is almost the same result as that given by the formula of Section 4.1.4.
The conclusion seems warranted therefore, that the formula of Section 4.1.4
is the practical equivalent of a Bayesian confidence interval when we are
willing to use the pilot study data to give us prior distributions for
the basic variances but are unwilling to make any prior assumptions about
the lot mean.
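
The three Bayesian cases compared in this section differ only in the value
given to M, so they can be run side by side. The Python sketch below is a
minimal illustration, assuming the form of V and of formula (15) as read from
the text; the function name is hypothetical and the t-value of 2.04 is passed
in as the figure used in the examples. Because it carries unrounded
arithmetic, its output need not agree exactly with the hand-computed figures.

```python
import math


def bayes_limits(n, xbar, M, mu0, s2c, f0, var0, t_value=2.04):
    """Centre and half-width of the interval in formula (15).

    t_value is supplied by the caller; 2.04 is the figure used in the text.
    """
    center = (n * xbar + M * mu0) / (n + M)
    v = (3 * n * s2c / 4 + n * xbar ** 2 + M * mu0 ** 2 + f0 * var0) / (n + M) - center ** 2
    return center, t_value * math.sqrt(v / (f0 + 5))


# n = 20, composite mean 10, pilot mean 12, s_c^2 = 0.32, f_o = 30, sigma_o^2 = 1.35
for M in (800, 80, 0):
    center, half = bayes_limits(n=20, xbar=10.0, M=M, mu0=12.0, s2c=0.32, f0=30, var0=1.35)
    print(f"M = {M:3d}: {center:.2f} +/- {half:.2f}")
```

Lowering M moves the centre from the pilot mean toward the composite mean and
widens the interval, which is the behavior the comparison turns on.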
PANEL DISCUSSION ON BULK SAMPLING


Chairman:  Walter  D.  Foster,  Biological  Laboratories,  Fort  Detrick, 

Frederick,  Maryland 

Discussant:  Acheson  J.  Duncan,  The  Johns  Hopkins  University, 

Baltimore,  Maryland 

Panelists:  Boyd  Harshbarger,  Virginia  Polytechnic  Institute, 

Blacksburg,  Virginia 

Henry Ellner, U. S. Army Materiel Command,

Washington, D. C.

Gene Ray Lowrimore, Hercules, Inc., Radford Army
Ammunition  Plant,  Radford,  Virginia 

Joseph  Mandelson,  Edgewood  Arsenal,  Maryland 

Vernon  H.  Rechmeyer,  Thiokol  Chemical  Corporation, 
Huntsville  Division,  Redstone  Arsenal,  Alabama 


Since  the  host  installation  for  the  Fourteenth  Conference  on  the 
Design  of  Experiments  has  a  special  interest  in  chemical  and  other  forms 
of  bulk  sampling,  the  Program  Committee  decided  to  have  a  group  discus¬ 
sion  in  this  area  of  statistics.  Dr.  Walter  Foster  agreed  to  serve  as 
chairman  of  the  panel  and  to  select  several  experts  to  help  him  explore 
this  field. 

Three papers on bulk sampling appear in this technical manual: the
preceding article by Professor Acheson Duncan and the next two papers,
one by Joseph Mandelson and the other by Gene Lowrimore.


CHEMICAL  SAMPLING 

Joseph  Mandelson,  Edgewood  Arsenal,  Maryland 

The  problem  of  sampling  of  chemical  materials  has  never  been  solved 
on  an  overall  basis  and  is  not  likely  ever  to  be  solved  in  this  manner. 

By  an  "overall  basis,"  I  mean  the  establishment  of  a  standard  such  as 
Military  Standard  105  applicable  to  all  materials  which  contain  classi¬ 
fiable  quality  characteristics  and  to  which  an  AQL  can  meaningfully  be 
assigned.  In  the  past,  a  number  of  standards  have  been  prepared  governing 
the sampling, inspection, and test of chemicals (e.g., ASTM, AOAC, etc.),
but  each  standard  is  specific  for  one  material  and  usually  applicable  to 
only  one  type  or  grade  of  that  material.  Thus,  an  ASTM  standard  for 
testing  quicklime  will  tell  you  nothing  about  sampling  of  reagent  grade 
CaO.  And  that  is  as  it  should  be. 
I believe we can handle the problem only by recognizing, in detail,
what our objectives are and by indicating what can be done to handle each
type of objective.

First  off,  there  are  three  general  ideas  which  are  of  the  greatest 
significance  in  this  area. 

a. The concept of percent defective, which is basic to Military
Standards 105 and 414, and in terms of which AQL's are expressed, has
no meaning in connection with testing chemicals per se. Of course,
inspection of factors such as packaging, packing, and marking of
chemicals can be accomplished using AQL's and Military Standard 105,
but not the actual testing of the chemical.

b. Probably the most important characteristic to be defined in
planning chemical sampling is the degree of lot homogeneity required.
This must be determined within the framework of the actual way the
material is used and by the importance of the material in that usage.
Examples of this will be indicated later on.

c. While Military Standards 105 and 414 assume that no inspection
error occurs and their OC curves are plotted accordingly, the actual
existence of error merely results in the translation of the OC to the
right or left depending on the kind of error made. In chemical sampling,
we have no OC curve (because the abscissa is a percent defective), but we
do have an experimental error which may or may not be large enough to
be significant. In any case, the size of experimental error can be
determined (assuming competent testing personnel) and the causes thereof
ascertained. In every case, the acceptance criteria set must reflect
the irreducible experimental error while the actual sampling and test
procedures must be hedged about with specified technical precautions
to hold these errors to as near these minima as possible.

Now,  let  us  see  how  these  general  ideas  affect  the  problem  of 
chemical  sampling: 

a. Military Standard 105 has appeared to many, if not most,
Quality Assurance engineers like Lydia Pinkham's pills, a cure-all
for whatever ailments you have. They prescribe its use for anything
and everything - including chemical sampling and sampling for
destructive test. The small sample sizes contained in the S levels of
Military Standard 105 are particularly cited though unsuited for these
two purposes. For chemical sampling, the sample size indicated in the
S levels depends upon the AQL prescribed and, as already indicated,
AQL is rarely of significance in chemical testing. For example,
for lot size of 1000 (packages, I suppose) level S-2, Military
Standard 105 prescribes sample code letter C, which calls for a
sample of 3 for 4.0% AQL, a sample of 8 for 6.5% AQL and a sample
of 5 for all other AQL's. Of course, the allowable number of defects,
which has little if any meaning in chemical sampling, differs with
sample size and AQL. So what do we do? Obviously, we had best avoid
quoting Military Standard 105 and put a sampling plan of our own into
the specification. As an afterthought: suppose we write a specification
for technical grade acetone. This can come in any number of
commercial packages, from 5-gallon cans, 55-gallon drums, to tank
cars. Imagine using level S-2 or any other quote from Military
Standard 105!

b. Before we discuss the problem of homogeneity, I'd like to
point out that chemical tests are or should be specified for
accomplishment in replicate (that is, in 2 or more parallel determinations).
Results are expected to vary due to experimental error so it is
possible (and it frequently occurs) that one replicate will appear
to fail with respect to one or more quality characteristics while
others may meet the requirement. We usually allow the average to
govern. But this is not always spelled out in the specification.
Furthermore, the use of such undeclared decision criteria ignores the
fact that certain requirements are far more important than others so
that the average, by itself, may be insufficient to insure a desirable
product. In fact, in many cases, an exact parallel exists with the
concept of classification of defects as used in sampling and inspection
in accordance with Military Standard 105.

For  instance,  for  a  vaccine,  acceptance  will  require  that  no  living 
virus  be  observed  in  any  of  the  many  replicate  samples  taken  from  the 
batch.  This  corresponds  to  the  Military  Standard  105  critical  defect. 
Further, the number of units per gram or ml. of material is very
important since dosage depends on precise control of this figure.

It may be possible to admit of some variation such that one or more
of the replicates may be permitted to fall somewhat below the specified
minimum provided the average is not less than this minimum while the
variation, measured as a standard deviation, is not greater than some
prescribed maximum. This corresponds to the major defect concept.

There  may  be  additional  requirements  (e.g.  specific  gravity,  etc.) 
of  lesser  importance  where  the  average  alone  may  be  permitted  to 
govern.  These  are  equivalent  to  the  minor  defect. 

We  can  see,  therefore,  that  the  more  important  the  requirement, 
the  greater  the  need  for  the  lot  to  be  homogeneous  and  the  more 
stringent  the  evidence  required  to  prove  it.  Also,  you  must  now  be 
aware  that  the  requirement  for  homogeneity  stems  from  the  way  the 
material is used and what it is supposed to accomplish.
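
The parallel with critical, major and minor defects can be made concrete as a
decision rule on replicate determinations. The Python sketch below is only an
illustration: the function name, the replicate values and the limits are all
hypothetical, and a critical requirement is represented here simply as one
that every replicate must meet individually.

```python
from statistics import mean, stdev


def accept(replicates, minimum, kind, max_sd=None):
    """Judge replicate determinations of one quality characteristic.

    kind is "critical", "major" or "minor", mirroring the defect classes.
    """
    if kind == "critical":
        # every replicate must individually meet the requirement
        return all(x >= minimum for x in replicates)
    if kind == "major":
        # the average must meet the minimum and the spread must be limited
        return mean(replicates) >= minimum and stdev(replicates) <= max_sd
    if kind == "minor":
        # the average alone governs
        return mean(replicates) >= minimum
    raise ValueError(f"unknown kind: {kind}")


# Hypothetical replicates against a hypothetical specified minimum of 99.0.
results = [98.5, 100.4, 101.1]
for kind, sd_limit in (("critical", None), ("major", 1.5), ("minor", None)):
    print(kind, accept(results, 99.0, kind, max_sd=sd_limit))
```

The same three replicates fail as a critical characteristic, because one of
them falls below the minimum, yet pass as a major or minor one.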

By contrast with vaccine, let us consider FS: chlorosulfonic
acid - SO3 solution. The most important requirement is total acidity.
However, in its use as a smoke agent, if the total acidity were 5%
below the specified minimum, it is doubtful that you could see any
difference in the smoke it made. So this would be a minor characteristic,
even though it is the most important one, and the average of replicate
determinations on a composite sample would be sufficient to govern.
In determining homogeneity, it is usually possible to use a
single characteristic, not necessarily the one which is of critical
interest, to prove it. Major characteristics require a number of
individual samples and replicate tests of each but all minor
characteristics can be determined on a composite sample. With this general
guidance and your knowledge of the material and how and why it is used,
meaningful, economic chemical sampling can be devised. One way of
insuring a degree of homogeneity is to prescribe that product shall
contain material from not more than one batch of chemicals. The batch
is defined as that quantity of material manufactured by some unit
chemical process or subjected to some physical mixing operation
intended to make the final product substantially uniform. This is a
minimum requirement in production of a homogeneous product.

c. In all chemical measurements explicit consideration must be
given to the experimental error of the specified procedure. It is
frequently taken that, in a well-run laboratory, the most common
source of error lies in reading the instrument; e.g., 0.02 ml for
ordinary burettes, 0.1 mg for the analytical balance, etc. Any
experienced analyst knows this premise is highly optimistic and that
reading errors comprise only a fraction, perhaps, but a small fraction,
of the total error. Most important, however, is recognition of the
fact that the assumption of a constant laboratory-wide error is pure
fantasy, that every procedure has its own inherent error, and that
this is modified by the personal error of the analyst, sampling, and
the like. For this reason, specification criteria can be intelligently
and fairly established only when and if a valid estimate of experimental
error is provided. This is easier said than done.

If we want to determine the experimental error of a procedure, we
must ask whether this will be done under "ideal" conditions or under
those obtaining in an ordinary laboratory using "normal" precautions;
whether to use the most proficient analyst or the journeyman. Merely
to state the question is to indicate how difficult it is to implement
the decision.

So you see, there is no quick and easy answer to chemical sampling.
Each case must be considered by itself. Frequently, a recognized
sampling standard for a material of similar characteristics may be
used as a guide but considerable technical soul-searching is required
before you snatch at this straw. The excellent specifications put out
by ASTM, AOAC, etc., are based on long experience with the specified
commercial chemicals but each refers only to the specific material
covered. They provide excellent guidance - but they are only guides,
not answers to all problems.

The importance of proper sampling is stressed in many texts on
chemical analysis but the advice given is frequently ignored in practice.
It is well known that a sample, improperly taken, can vitiate the results
obtained by the most competent analyst using the most sophisticated methods
and apparatus available. Yet, in practice, because the actual preparation
of a sample usually requires considerable physical exertion, the task is
allowed to devolve upon laborers, operating under vague, imprecise
instructions which they understand imperfectly, if at all. For example,
what do laborers, indeed many professional analysts, know of the special
connotations hidden in the deceptively simple requirement "take a random
sample?" What do they know of the techniques and tools which must be
employed to insure true randomicity?

Ideally, the analyst will be thoroughly trained in the art of taking
samples, in seeing and knowing how to overcome the many unforeseen
difficulties which arise in every sampling environment. Such a man should
take and prepare the sample himself, but this is rarely practical. As
an alternative, there is no objection to the expedient of having the sample
taken by non-professional personnel provided always they are under the
direct, personal supervision of a competent individual. Indeed, if they
have been suitably trained in every aspect of the task under the conditions
they will face, the continual presence of the supervisor may not be
required. However, assurance must be given in all cases that the individual
taking the sample is himself knowledgeable or that he is acting in
accordance with the explicit instructions of a competent person, preferably an
experienced analyst. All too often environmental changes, not necessarily
always meteorological in nature, produce conditions not envisioned by the
specification writer, which must be overcome to produce a proper sample.
Only a competent, knowledgeable supervisor of sampling personnel can be
entrusted with the responsibility for devising necessary additions to and
modifications of the prescribed procedure (and documenting these) to insure
that a proper sample is taken in the circumstances.

A great weakness in many analytical chemists is their lack of
familiarity with the statistical considerations involved in the phenomenon
of experimental error. This is not to say that chemists are unaware of
or underestimate the importance of experimental error. It is simply the
case that so many of them do not know how to handle it or even know it can
be handled. Fortunately, modern curricula have replaced old-fashioned,
inefficient statistics (e.g., average deviation, etc.) with more modern,
efficient concepts such as standard deviation but it remains a matter of
concern whether sufficient emphasis has been placed on teaching the student
the dangers of bias and how to avoid them, the true meaning of randomicity
and how to effect it, the components of variance and how to calculate them
and, more generally, how to employ statistics in analytical chemistry.

Sampling error ($\sigma_s$) is a significant factor in overall experimental
error. When determined as part of a factorial experiment, $\sigma_s$ will
frequently turn out to be surprisingly high as compared with other components
of experimental error. For this reason, the reduction of $\sigma_s$ to a minimum
is an important factor in improving chemical testing. To effect this
objective, it is essential to use valid statistical methods to determine
$\sigma_s$ so that alternative methods of sampling may be evaluated by quantitative
determination of $\sigma_s$ and that procedure adopted which has demonstrated the
lowest sampling error. It is interesting to note that normal statistical
test methods (e.g., analysis of variance) will not only measure $\sigma_s$, but
will usually identify the causes of error, thus furnishing leads as to
what can be done to reduce or eliminate them.
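
As one illustration of how analysis of variance separates sampling error from
the remaining experimental error, the Python sketch below applies a one-way
random-effects analysis to replicate determinations grouped by sample. It is a
minimal sketch under stated assumptions: the function name and the data are
hypothetical, equal numbers of replicates per sample are assumed, and the
between-sample component is taken as the estimate of sampling variance.

```python
from statistics import mean


def variance_components(groups):
    """One-way random-effects ANOVA for equal-sized groups.

    groups holds one inner list of replicate determinations per sample.
    Returns (between-sample variance estimate, within-sample variance estimate).
    """
    k = len(groups)
    r = len(groups[0])
    if any(len(g) != r for g in groups):
        raise ValueError("this sketch assumes the same number of replicates per sample")
    grand = mean([x for g in groups for x in g])
    ms_between = r * sum((mean(g) - grand) ** 2 for g in groups) / (k - 1)
    ms_within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (k * (r - 1))
    # E[MS_between] = var_within + r * var_between, so solve for var_between
    var_between = max((ms_between - ms_within) / r, 0.0)
    return var_between, ms_within


# Hypothetical data: three samples, two replicate determinations on each.
data = [[10.1, 10.3], [11.0, 10.8], [9.6, 9.8]]
var_sampling, var_analytical = variance_components(data)
print(round(var_sampling, 4), round(var_analytical, 4))
```

With these made-up figures the between-sample component is much the larger of
the two, which is the kind of result that would direct attention to the
sampling procedure and away from the analytical method.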
COMMENTS ON BULK SAMPLING


Gene R. Lowrimore

Hercules,  Inc.,  Radford  Army  Ammunition  Plant 
Radford,  Virginia 


Professor Duncan, in his presentation, discussed compositing as an
integral part of the methodology of bulk sampling. My comments will
not be directed to bulk sampling, per se, but I think they are pertinent
to the question of what happens when we composite, or blend. Normally,
when we draw a test unit in a bulk sampling situation, we assume that
it consists of a very large number of, say, particles. In contrast, if
the test unit consisted of only one particle, we would be in a discrete
sampling situation.

At Radford Army Ammunition Plant, we manufacture a number of cannon
propellants. The smallest identifiable unit of one of these propellants
is a grain or fairly large particle, for example, 0.1" by 0.8". These
propellants are manufactured in a stream of batches and a large number
of batches are combined through a blending process to form a lot. Test
units are drawn from the lot and, consequently, contain grains from a
number of batches.


Because a charge weight correction is made for every lot at firing,
the lot mean is of secondary interest to us. The within-lot variance is
our primary concern, since it is directly related to landing round after
round on target.

We have undertaken a mathematical investigation of the test-to-test
or within-lot variability in terms of the batch-to-batch and within-batch
variability. In our investigation, we assumed that the true value for
the test unit is the sum of the values for the particles making up that
test unit. This assumption allowed us to exploit the analogy between this
situation and the situation in sample survey theory where we are estimating
a total from a stratified sample. We have some results for the case where
the number of grains in the test unit is much greater than the number
of batches blended.

We are currently studying the situation where the number of grains in the
test unit may actually be less than the number of batches. All batches
cannot now be represented in the test unit. We hope to gain some insight
into what happens to the within-lot variability in this case.

These investigations have provided us with valuable insight into the
relationship between discrete and bulk sampling and what compositing does
in some bulk sampling situations.
SOME STATISTICAL ASPECTS OF ASSURANCE OF STERILIZATION


F. M. Wadley

Consultant to Fort Detrick, Frederick, Maryland


In biological research, we often deal with assurance of sterilization
or disinfection, especially in microbiological work and in pest quarantines.
We desire assurance that our procedures will give protection against
subsequent infection.

Often, we cannot be entirely sure of 100% kills; circumstances of
treatment may not be perfect, or the population treated may be very
slow in approach to 100% mortality. The probit transformation, widely
used in dosage-mortality studies, does not allow mathematically for 100%
kill, though it can be approached as closely as desired. Some
well-qualified workers in the mortality field prefer to define experimentally
a very small risk, which can be accepted. The assurance is then that
the probability of any survival is very small indeed, and that with
ordinary numbers treated, survival of even one individual will be rare.

This viewpoint is discussed by A. C. Baker (1939). It seems more
realistic than speaking of 100% kill, and helps to keep preliminary
tests to a manageable volume.

For this reason, studies of assurance may deal in very low
probabilities; perhaps one survival in thousands or millions. The
probabilities are defined by preliminary work, which must obviously be
quite extensive and involve great numbers of individuals. Sometimes a
limited extrapolation to greater numbers or lower survivals is used.

It  is  desirable  to  be  thorough  in  preliminary  tests  without  going  to 
a  prohibitive  amount  of  work. 

Very low percentage counts are involved, and these can be treated
as binomially distributed if care is used. The close relation of the
Poisson distribution to the binomial can be utilized with some gain in
convenience, where percentages are near zero or 100, and numbers are
large. For example, suppose an estimate of 3 per 10,000 average survival,
or a proportion of 0.0003. Using the binomial, estimates of the distribution
of survivals can be made from several terms of the binomial
$(0.0003 + 0.9997)^{10,000}$. Using the Poisson, the distribution of
survivals can be estimated simply by expanding the Poisson with mean 3,
or by looking in published tables. This is true for survival estimates
of 3 per 10,000; 3 per 1,000; or 3 per million. Student (1907) showed that
the binomial approaches the Poisson at its extreme proportions with n large.
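
The closeness of the two distributions for the example above (n = 10,000, proportion 0.0003, mean 3) can be checked with a short sketch. The helper names are illustrative; only the Python standard library is used.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k survivors among n treated individuals."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def poisson_pmf(k, mean):
    """Poisson probability of exactly k survivors."""
    return math.exp(-mean) * mean**k / math.factorial(k)

n, p = 10_000, 0.0003          # 3 survivors per 10,000 on average
table = [(k, binomial_pmf(k, n, p), poisson_pmf(k, n * p)) for k in range(8)]
for k, b, q in table:
    print(f"{k} survivors: binomial {b:.5f}  Poisson {q:.5f}")
```

For these values the two columns should agree to roughly four decimal places, which is why the Poisson tables are a convenient substitute for the binomial expansion.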

A recent inquiry to the Physical Defense Department at Fort Detrick
referred to establishing an assurance that chance of contamination be
not over 1 in 1 million. The material in question was a biological fluid
to be transported under stressful conditions. The frequently used method
of heat sterilization could not be used because heat would alter the
fluid. Filtration was to be used. The treatment is described by
Fortner, Phillips, and Hoffman (1967).

Extensive tests were made with reusable and disposable filters,
dealing with large populations of Serratia marcescens. The best
filters gave no survival out of an estimated total of 240,000,000
organisms in replicated trials. Referring to the Poisson, it is
found that populations averaging 3 will give an occasional zero; with
means of 4 or more, zero is rare. Thus, a tentative maximum of 3
passing per 240,000,000, or 1 for 80 million, is reached. If the
survivors average 1 in 80 million, and there are only 80 organisms
in the material, a chance of only 1 in 1 million is tentatively
reached. Other good filters gave occasional survivals of 1 or 2,
and seemed to be in the same class.
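
The arithmetic in this paragraph can be reproduced with a few lines. This is a sketch of the reasoning as stated in the text (a Poisson model with a tentative maximum mean of 3 survivors per 240,000,000 organisms); the variable and function names are illustrative.

```python
import math

def prob_zero(mean):
    """Poisson probability of observing no survivors when the mean is `mean`."""
    return math.exp(-mean)

organisms_tested = 240_000_000
max_mean_passing = 3                      # tentative maximum from the text
rate_per_organism = max_mean_passing / organisms_tested   # 1 in 80 million

def prob_any_survivor(n_organisms, rate=rate_per_organism):
    """Chance that at least one of n_organisms passes the filter."""
    # expm1 keeps precision when the product n_organisms * rate is tiny.
    return -math.expm1(-n_organisms * rate)

print(f"P(zero | mean 3) = {prob_zero(3):.4f}")
print(f"P(zero | mean 4) = {prob_zero(4):.4f}")
print(f"P(any survivor among 80 organisms) = {prob_any_survivor(80):.2e}")
```

A mean of 3 gives a zero count about 5% of the time and a mean of 4 about 2% of the time, which is the sense in which zero is "occasional" for 3 and "rare" for 4.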

The  material  seems  likely  to  have  much  more  than  80  in  a  typical 
sample.  The  solution  reached  was  to  use  a  second  filtration,  which 
would  seem  to  give  ample  assurance.  This  second  filtration  also  aids 
in  the  question  of  possibly  defective  filters.  An  occasional  defective 
filter  in  a  disposable  lot,  or  a  proven  but  deteriorating  filter  from 
a  reusable  lot,  is  to  be  avoided.  The  second  filtration  with  new 
filters from good lots seems to reduce this hazard to insignificance
without an inordinate amount of work.

Another  case  of  use  of  very  small  probabilities  is  given  by  Baker, 
in  the  case  of  fruit  sterilization  by  moderate  heat,  to  kill  fruit  fly 
stages.  This  was  in  quarantine  work.  Populations  were  estimated  by 
rearing  the  adults  out,  both  in  a  check  sample  and  in  treated  samples. 

A  graded  series  of  time  exposures  was  used,  and  time  was  treated  as 
dosage  in  a  probit  analysis.  Several  thousand  individuals  per  dose 
were  used,  and  probit  values  up  to  more  than  8  were  secured.  The 
lines were extrapolated to estimate dosage required for 9 probits
(3 survivors per 100,000), which the author believed to be an acceptable
risk. 
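
The correspondence between 9 probits and about 3 survivors per 100,000 follows from the usual probit convention, assumed here, that the probit of a mortality proportion is 5 plus the corresponding standard normal deviate. A minimal check, with an illustrative function name:

```python
from statistics import NormalDist

def survival_fraction(probit):
    """Expected surviving fraction at a given mortality probit.

    Assumes the classical convention probit = 5 + standard normal deviate,
    so survival = 1 - Phi(probit - 5) = Phi(5 - probit).
    """
    return NormalDist().cdf(5.0 - probit)

survivors_per_100k = survival_fraction(9.0) * 100_000
print(f"{survivors_per_100k:.2f} survivors per 100,000 at 9 probits")
```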
RESEARCH AND DEVELOPMENT MATHEMATICAL EQUATIONS
AS RELATES TO AN ARMY AIRCRAFT SYSTEM


Tony N. O'Truk*

U. S. Army Aviation Materiel Command
St. Louis, Missouri


ABSTRACT. This paper covers the life cycle of the Research
and Developmental Phase of an Army aircraft system. It also covers
the preparation of mathematical equations as pertains to the hardware
under the prototype aircraft, as well as the training, maintenance
support, and administration of the prototype aircraft system.
HYPOTHESES TESTING AND CONFIDENCE INTERVALS FOR PRODUCTS AND QUOTIENTS
OF POISSON PARAMETERS WITH APPLICATIONS TO RELIABILITY

Bernard Harris

Mathematics Research Center, United States Army,
The University of Wisconsin, Madison, Wisconsin


ABSTRACT. $X_1, X_2, \ldots, X_{k_1}$, $Y_1, Y_2, \ldots, Y_{k_2}$ are
$k_1 + k_2$ mutually independent Poisson random variables with parameters
$\lambda_1, \lambda_2, \ldots, \lambda_{k_1}$, $\mu_1, \mu_2, \ldots, \mu_{k_2}$
respectively. Confidence intervals and tests of hypotheses for the
parameter $\theta = \lambda_1 \lambda_2 \cdots \lambda_{k_1} / (\mu_1 \mu_2 \cdots \mu_{k_2})$
are obtained. Under suitable conditions these procedures may be used to
obtain approximate confidence intervals and tests of hypotheses for the
parameter $\rho = p_1 p_2 \cdots p_{k_1} / (p_{k_1+1} p_{k_1+2} \cdots p_{k_1+k_2})$,
where the $p_i$'s, $i = 1, 2, \ldots, k_1 + k_2$, are binomial parameters.
This problem is of importance in reliability analysis and some applications
to reliability analysis are exhibited.


The remainder of this article has been reproduced photographically
from the author's copy. It was issued by the Mathematics Research Center
as MRC Technical Summary Report No. 923.
HYPOTHESES TESTING AND CONFIDENCE INTERVALS FOR PRODUCTS
AND QUOTIENTS OF POISSON PARAMETERS WITH APPLICATIONS
TO RELIABILITY

Bernard Harris

1. Introduction and Summary. Let $X_1, X_2, \ldots, X_{k_1}$;
$Y_1, Y_2, \ldots, Y_{k_2}$ be $k_1 + k_2$ mutually independent Poisson random
variables with parameters $\lambda_1, \lambda_2, \ldots, \lambda_{k_1}$;
$\mu_1, \mu_2, \ldots, \mu_{k_2}$ respectively. In this paper, we obtain
confidence intervals for the parameter
$\theta = \lambda_1 \lambda_2 \cdots \lambda_{k_1} / (\mu_1 \mu_2 \cdots \mu_{k_2})$
and the corresponding tests of hypotheses. The required theoretical
development is given in section 2. In section 3, we examine the particular
case $k_1 = 2$, $k_2 = 0$ because of the specific nature of the answer obtained
in this case. In section 4, some of the concrete situations which lead to
this problem are pointed out and some numerical illustrations are given. In
particular, the reader should note that for $k_2 = 0$, the parameter $\theta$
is a product of Poisson parameters and the solution to the present problem
can be interpreted as an approximate solution to the corresponding problem
of finding confidence intervals for the product of binomial parameters.
Estimation of the product of binomial parameters has been investigated by
A. Madansky [2] and R. J. Buehler [1]. Their results and methods will be
compared with those of the present paper in section 4.

2. Determining Confidence Intervals for $\theta$. The joint distribution of
$X_1, X_2, \ldots, X_{k_1}; Y_1, Y_2, \ldots, Y_{k_2}$ is given by

Sponsored  by  the  Mathematics  Research  Center,  United  States  Army,  Madison, 
Wisconsin,  under  Contract  No.  :  DA-31-124-ARO-D-462. 
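(1)  $P(x_1, \ldots, x_{k_1}; y_1, \ldots, y_{k_2}; \lambda_1, \ldots, \lambda_{k_1}, \mu_1, \ldots, \mu_{k_2}) = \exp\Bigl(-\sum_{i=1}^{k_1} \lambda_i - \sum_{j=1}^{k_2} \mu_j\Bigr) \prod_{i=1}^{k_1} \frac{\lambda_i^{x_i}}{x_i!} \prod_{j=1}^{k_2} \frac{\mu_j^{y_j}}{y_j!}$,

where $\lambda_i > 0$, $x_i = 0, 1, 2, \ldots$, $i = 1, 2, \ldots, k_1$;
$\mu_j > 0$, $y_j = 0, 1, 2, \ldots$, $j = 1, 2, \ldots, k_2$.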

Assume $k_1 > 0$. Then, let $U_1 = X_1$ and for $i = 2, 3, \ldots, k_1$ define
$U_i = X_i - X_1$; for $j = 1, 2, \ldots, k_2$, define $V_j = Y_j + X_1$. The
joint distribution of $U_1, U_2, \ldots, U_{k_1}, V_1, V_2, \ldots, V_{k_2}$
is then given by

(2)  $P_1(u_1, \ldots, u_{k_1}; v_1, \ldots, v_{k_2}; \lambda_1, \ldots, \lambda_{k_1}, \mu_1, \ldots, \mu_{k_2}) = \exp\Bigl(-\sum_{i=1}^{k_1} \lambda_i - \sum_{j=1}^{k_2} \mu_j\Bigr) \frac{\lambda_1^{u_1}}{u_1!} \prod_{i=2}^{k_1} \frac{\lambda_i^{u_i+u_1}}{(u_i+u_1)!} \prod_{j=1}^{k_2} \frac{\mu_j^{v_j-u_1}}{(v_j-u_1)!}$,

$u_1 = 0, 1, 2, \ldots$; $u_i = -u_1, -u_1+1, -u_1+2, \ldots$, $i = 2, 3, \ldots, k_1$;
$v_j = u_1, u_1+1, u_1+2, \ldots$, $j = 1, 2, \ldots, k_2$.

Consequently, the conditional distribution of $U_1$ given $U_2 = u_2$,
$U_3 = u_3, \ldots, U_{k_1} = u_{k_1}$, $V_1 = v_1$, $V_2 = v_2, \ldots,
V_{k_2} = v_{k_2}$ is

(3)  $p_\theta(u_1 \mid u_2, u_3, \ldots, u_{k_1}, v_1, v_2, \ldots, v_{k_2}) = \dfrac{\theta^{u_1}\, h^{-1}(u_2, u_3, \ldots, u_{k_1}, v_1, v_2, \ldots, v_{k_2}; \theta)}{u_1! \prod_{i=2}^{k_1} (u_i + u_1)! \prod_{j=1}^{k_2} (v_j - u_1)!}$,

where $\max\bigl(0, \max_{2 \le i \le k_1}(-u_i)\bigr) \le u_1 \le \min_{1 \le j \le k_2} v_j$ and

(4)  $h(u_2, u_3, \ldots, u_{k_1}, v_1, v_2, \ldots, v_{k_2}; \theta) = \sum_r \theta^r \Big/ \Bigl(r! \prod_{i=2}^{k_1} (u_i + r)! \prod_{j=1}^{k_2} (v_j - r)!\Bigr)$,

the sum running from $\max\bigl(0, \max_{2 \le i \le k_1}(-u_i)\bigr)$ to
$\min_{1 \le j \le k_2} v_j$. In particular, note that the probability
distribution (3) depends only on $\theta$ and not on the individual
$\lambda$'s and $\mu$'s. Since the probability distribution (3) is a member of
a one-parameter exponential family, one- and two-sided tests of size $\alpha$
of hypotheses concerning $\theta$ can be written down as follows.

To test $H: \theta = \theta_0$ against alternatives $\theta > \theta_0$,
reject $H$ if $U_1 = k$ and

(5)  $\sum_{u_1 \ge k} p_{\theta_0}(u_1 \mid u_2, u_3, \ldots, u_{k_1}, v_1, v_2, \ldots, v_{k_2}) \le \alpha$.

To test $H: \theta = \theta_0$ against alternatives $\theta < \theta_0$,
reject $H$ if $U_1 = k$ and

(6)  $\sum_{u_1 \le k} p_{\theta_0}(u_1 \mid u_2, u_3, \ldots, u_{k_1}, v_1, v_2, \ldots, v_{k_2}) \le \alpha$.

To test $H: \theta = \theta_0$ against the alternative $\theta \ne \theta_0$,
reject $H$ if $U_1 = k$ and either

(7a)  $\sum_{u_1 \le k} p_{\theta_0}(u_1 \mid u_2, u_3, \ldots, u_{k_1}, v_1, v_2, \ldots, v_{k_2}) \le \alpha/2$

or

(7b)  $\sum_{u_1 \ge k} p_{\theta_0}(u_1 \mid u_2, u_3, \ldots, u_{k_1}, v_1, v_2, \ldots, v_{k_2}) \le \alpha/2$.

The tests given by (5) and (6) are uniformly most powerful similar tests.

The test given by (7) is similar, but in specifying the right hand sides of
both (7a) and (7b) as $\alpha/2$, this will generally not be a uniformly most
powerful similar test. This choice is suggested for ease of computation,
since the "optimal" choices for the right hand sides of (7a) and (7b) will
depend on $(u_2, u_3, \ldots, u_{k_1}, v_1, v_2, \ldots, v_{k_2})$. It should
be noted that since $p_\theta(u_1 \mid u_2, u_3, \ldots, u_{k_1}, v_1, v_2,
\ldots, v_{k_2})$ is discrete, the tests given above actually are tests of
size not exceeding $\alpha$. In order to produce tests of exact size $\alpha$,
randomized tests will usually have to be employed. The required modifications
can easily be carried out.
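
The conditional distribution (3), with its normalizing sum (4), and the tail sums that appear in (5) through (7) can be evaluated numerically. The following is a minimal sketch, not the author's computational procedure: the function names are hypothetical, log-factorials are used for numerical stability, and when $k_2 = 0$ (infinite support) the series is truncated after an arbitrary number of terms.

```python
import math

def conditional_pmf(theta, u_rest, v, max_terms=500):
    """Conditional pmf of U_1 from equations (3)-(4).

    u_rest: observed (u_2, ..., u_{k1}); v: observed (v_1, ..., v_{k2}).
    Returns {u1: probability}. If v is empty (k2 = 0) the series is
    truncated after max_terms terms, an illustrative choice.
    """
    lo = max([0] + [-u for u in u_rest])
    hi = min(v) if v else lo + max_terms
    if hi < lo:
        raise ValueError("empty support: inconsistent observations")
    log_w = {}
    for r in range(lo, hi + 1):
        # log of theta^r / (r! * prod (u_i + r)! * prod (v_j - r)!)
        log_w[r] = (r * math.log(theta) - math.lgamma(r + 1)
                    - sum(math.lgamma(u + r + 1) for u in u_rest)
                    - sum(math.lgamma(vj - r + 1) for vj in v))
    top = max(log_w.values())
    weights = {r: math.exp(lw - top) for r, lw in log_w.items()}
    total = sum(weights.values())        # proportional to h in (4)
    return {r: w / total for r, w in weights.items()}

def tail_sums(pmf, k):
    """Return (sum over u1 <= k, sum over u1 >= k), as used in (5)-(7)."""
    lower = sum(p for u, p in pmf.items() if u <= k)
    upper = sum(p for u, p in pmf.items() if u >= k)
    return lower, upper

# Check with k1 = k2 = 1: given V_1 = X_1 + Y_1 = 2 and theta = 1,
# U_1 = X_1 is binomial(2, 1/2).
pmf = conditional_pmf(1.0, [], [2])
print(pmf, tail_sums(pmf, 1))
```

The check case is a useful sanity test: for $k_1 = k_2 = 1$ the formula reduces to the familiar binomial distribution of one Poisson count given the total, with success probability $\theta/(1+\theta)$.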


Confidence intervals of confidence coefficient $1 - \alpha$ can be easily
obtained for each of the above tests.

Upon observing $U_1 = k$, the $1 - \alpha$ upper confidence limit $\theta_2$
for $\theta$ conditional on $U_2 = u_2$, $U_3 = u_3, \ldots, U_{k_1} = u_{k_1}$,
$V_1 = v_1$, $V_2 = v_2, \ldots, V_{k_2} = v_{k_2}$ is

(8)  $\theta_2 = \sup\Bigl\{\theta : \sum_{u_1 \le k} p_\theta(u_1 \mid u_2, u_3, \ldots, u_{k_1}, v_1, v_2, \ldots, v_{k_2}) > \alpha\Bigr\}$.

Similarly, the corresponding lower confidence limit $\theta_1$ for $\theta$
after observing $U_1 = k$ is

(9)  $\theta_1 = \inf\Bigl\{\theta : \sum_{u_1 \ge k} p_\theta(u_1 \mid u_2, u_3, \ldots, u_{k_1}, v_1, v_2, \ldots, v_{k_2}) > \alpha\Bigr\}$.

From (7a) and (7b), we can obtain a two-sided $1 - \alpha$ confidence interval
upon observing $U_1 = k$ by

(10)  $P\{\theta_1(k) < \theta < \theta_2(k)\} \ge 1 - \alpha$,

where

(10a)  $\theta_1(k) = \inf\Bigl\{\theta : \sum_{u_1 \ge k} p_\theta(u_1 \mid u_2, u_3, \ldots, u_{k_1}, v_1, v_2, \ldots, v_{k_2}) > \alpha/2\Bigr\}$

and

(10b)  $\theta_2(k) = \sup\Bigl\{\theta : \sum_{u_1 \le k} p_\theta(u_1 \mid u_2, u_3, \ldots, u_{k_1}, v_1, v_2, \ldots, v_{k_2}) > \alpha/2\Bigr\}$.

If $k_1 = 0$, then $\theta = (\mu_1 \mu_2 \cdots \mu_{k_2})^{-1}$ and upon
defining $\theta^* = \theta^{-1}$, the preceding tests and confidence intervals
((6) through (10)) are readily transformed to provide the corresponding
results for this case. That is, let $U_1 = Y_1$, $U_i = Y_i - Y_1$,
$i = 2, 3, \ldots, k_2$. Then, in precisely the same manner as before, we
obtain tests for $\theta^*$ and confidence intervals for $\theta^*$ which are
completely equivalent to tests and confidence intervals for $\theta$. These
facts are briefly summarized below.

The conditional distribution of $U_1$ given $U_2 = u_2$, $U_3 = u_3, \ldots,
U_{k_2} = u_{k_2}$ is

(11)  $p^*_{\theta^*}(u_1 \mid u_2, u_3, \ldots, u_{k_2}) = \dfrac{\theta^{*u_1}\, h_1^{-1}(u_2, u_3, \ldots, u_{k_2}; \theta^*)}{u_1! \prod_{i=2}^{k_2} (u_i + u_1)!}$,  $u_1 \ge \max\bigl(0, \max_{2 \le i \le k_2}(-u_i)\bigr)$,

where

(12)  $h_1(u_2, u_3, \ldots, u_{k_2}; \theta^*) = \sum_r \theta^{*r} \Big/ \Bigl[r! \prod_{i=2}^{k_2} (u_i + r)!\Bigr]$,

the sum running from $\max\bigl(0, \max_{2 \le i \le k_2}(-u_i)\bigr)$ to
$\infty$. Then a size $\alpha$ test of the hypothesis $H: \theta = \theta_0$
against the alternative $\theta < \theta_0$ is given by the rule: reject $H$
if $U_1 = k$ and

(13)  $\sum_{u_1 \ge k} p^*_{\theta_0^*}(u_1 \mid u_2, u_3, \ldots, u_{k_2}) \le \alpha$,

where $\theta_0^* = \theta_0^{-1}$.

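Similarly, a test of the hypothesis $H: \theta = \theta_0$ against the
alternative $\theta > \theta_0$ is given by the rule: reject $H$ if $U_1 = k$
and

(14)  $\sum_{u_1 \le k} p^*_{\theta_0^*}(u_1 \mid u_2, u_3, \ldots, u_{k_2}) \le \alpha$.

Finally, to test $H: \theta = \theta_0$ against the alternative
$\theta \ne \theta_0$, reject $H$ if $U_1 = k$ and either

(15a)  $\sum_{u_1 \ge k} p^*_{\theta_0^*}(u_1 \mid u_2, u_3, \ldots, u_{k_2}) \le \alpha/2$

or

(15b)  $\sum_{u_1 \le k} p^*_{\theta_0^*}(u_1 \mid u_2, u_3, \ldots, u_{k_2}) \le \alpha/2$.

Upon observing $U_1 = k$ the $1 - \alpha$ upper confidence limit $\theta_2$ for
$\theta$ conditional on $U_2 = u_2$, $U_3 = u_3, \ldots, U_{k_2} = u_{k_2}$ is
given by

(16)  $\theta_1^* = \inf\Bigl\{\theta^* : \sum_{u_1 \ge k} p^*_{\theta^*}(u_1 \mid u_2, u_3, \ldots, u_{k_2}) > \alpha\Bigr\}$,

and $\theta_2 = 1/\theta_1^*$.

Similarly, the corresponding lower confidence limit $\theta_1$ for $\theta$
after observing $U_1 = k$ is

(17)  $\theta_2^* = \sup\Bigl\{\theta^* : \sum_{u_1 \le k} p^*_{\theta^*}(u_1 \mid u_2, u_3, \ldots, u_{k_2}) > \alpha\Bigr\}$,

and $\theta_1 = 1/\theta_2^*$.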


Finally, the two-sided $1 - \alpha$ confidence interval upon observing
$U_1 = k$ is

(18)  $P\{\theta_1(k) < \theta < \theta_2(k)\} \ge 1 - \alpha$,

where

(18a)  $\theta_1^{-1}(k) = \sup\Bigl\{\theta^* : \sum_{u_1 \le k} p^*_{\theta^*}(u_1 \mid u_2, u_3, \ldots, u_{k_2}) > \alpha/2\Bigr\}$

and

(18b)  $\theta_2^{-1}(k) = \inf\Bigl\{\theta^* : \sum_{u_1 \ge k} p^*_{\theta^*}(u_1 \mid u_2, u_3, \ldots, u_{k_2}) > \alpha/2\Bigr\}$.

Remark. When $k_1 = 0$ we could also have proceeded by letting $V_1 = -Y_1$,
$V_j = Y_j - Y_1$, $j = 2, 3, \ldots, k_2$; then the conditional distribution
of $V_1$ given $V_2 = v_2$, $V_3 = v_3, \ldots, V_{k_2} = v_{k_2}$ would depend
on $\mu_1, \mu_2, \ldots, \mu_{k_2}$ only through $\theta$. The tests and
confidence intervals obtained by repeating the analysis leading to (3)
through (10) would give precisely the same results as (11) through (18).

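3. Tests and Confidence Intervals for the Product of Two Poisson Parameters.
In this section we exhibit some specific properties of the particular case
$k_1 = 2$, $k_2 = 0$; that is, $\theta = \lambda_1 \lambda_2$. In this case,

(19)  $p_\theta(u_1 \mid u_2) = \theta^{u_1} / \bigl(u_1!\,(u_1 + u_2)!\,h(u_2; \theta)\bigr)$,  $u_1 \ge \max(0, -u_2)$,

where

(20)  $h(u_2; \theta) = \sum_{r = \max(0, -u_2)}^{\infty} \theta^r / \bigl(r!\,(u_2 + r)!\bigr)$.

Define

$I_\nu(t, \kappa) = (\kappa/2)^\nu \sum_{k=0}^{t} \dfrac{(\kappa^2/4)^k}{k!\,(\nu + k)!}$,

where $I_\nu(\infty, \kappa) = I_\nu(\kappa)$ is the modified Bessel function
of order $\nu$. Then, if $u_2 \ge 0$,

$h(u_2; \theta) = \sum_{r=0}^{\infty} \theta^r / \bigl(r!\,(u_2 + r)!\bigr) = \theta^{-u_2/2}\, I_{u_2}(2\sqrt{\theta})$

and

$\sum_{u_1=0}^{t} \theta^{u_1} / \bigl(u_1!\,(u_1 + u_2)!\bigr) = \theta^{-u_2/2}\, I_{u_2}(t, 2\sqrt{\theta})$.

Thus, for $t$ an integer $\ge 0$,

(21)  $P_\theta\{U_1 \le t \mid U_2 = u_2\} = I_{u_2}(t, 2\sqrt{\theta}) / I_{u_2}(2\sqrt{\theta})$.

Similarly, if $u_2 < 0$, let $\nu = -u_2$; then

$h(u_2; \theta) = \sum_{r=\nu}^{\infty} \theta^r / \bigl(r!\,(r - \nu)!\bigr) = \sum_{r=0}^{\infty} \theta^{r+\nu} / \bigl(r!\,(r + \nu)!\bigr) = \theta^{\nu/2}\, I_\nu(2\sqrt{\theta})$.

Further,

$\sum_{u_1=\nu}^{t} \theta^{u_1} / \bigl(u_1!\,(u_1 - \nu)!\bigr) = \sum_{r=0}^{t-\nu} \theta^{r+\nu} / \bigl(r!\,(r + \nu)!\bigr) = \theta^{\nu/2}\, I_\nu(t - \nu, 2\sqrt{\theta})$.

Thus, for $t$ an integer $\ge -u_2$,

(22)  $P_\theta\{U_1 \le t \mid U_2 = u_2\} = I_{-u_2}(t + u_2, 2\sqrt{\theta}) / I_{-u_2}(2\sqrt{\theta})$.

In (21) and (22), $t$ is an integer $\ge 0$ if $u_2 \ge 0$, and an integer
$\ge -u_2$ if $u_2 < 0$.

It seems natural to name this distribution the "incomplete modified Bessel
function".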

Returning to the tests and confidence intervals given earlier, the
$1 - \alpha$ upper confidence limit $\theta_2$ for $\theta$ conditional on
$U_2 = u_2$ may be written

(24a)  $\theta_2 = \sup\{\theta : I_{u_2}(k, 2\sqrt{\theta}) / I_{u_2}(2\sqrt{\theta}) > \alpha\}$,  $u_2 \ge 0$,

and

(24b)  $\theta_2 = \sup\{\theta : I_{-u_2}(k + u_2, 2\sqrt{\theta}) / I_{-u_2}(2\sqrt{\theta}) > \alpha\}$,  $u_2 < 0$,

where $k$ is the observed value of $U_1$.

The other confidence intervals and tests given in (5) to (10) admit of similar
representations, which will not be explicitly given here. In this case, it is
also quite reasonable to tabulate this distribution and we hope to produce
such a tabulation in the near future.
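
In the absence of tables, the upper confidence limit for the product of two Poisson parameters can be found numerically. The sketch below is one possible approach rather than the author's method: it sums the series in (19) and (20) directly instead of calling a Bessel routine, and solves for the limit by bisection on a logarithmic scale, relying on the conditional distribution function decreasing as $\theta$ increases. The function names, the search bracket, and the truncation rule are assumptions of this example.

```python
import math

def cdf_u1(t, u2, theta):
    """P(U_1 <= t | U_2 = u2) for k1 = 2, k2 = 0, from (19)-(20)."""
    lo = max(0, -u2)
    # Enough terms to pass the peak of the series near sqrt(theta).
    n_terms = max(t - lo + 1, 0) + int(4 * math.sqrt(theta)) + 60
    log_w = [r * math.log(theta) - math.lgamma(r + 1) - math.lgamma(r + u2 + 1)
             for r in range(lo, lo + n_terms)]
    top = max(log_w)
    w = [math.exp(x - top) for x in log_w]
    return sum(w[:max(t - lo + 1, 0)]) / sum(w)

def upper_limit(k, u2, alpha, bracket=(1e-12, 1e6), iters=200):
    """Largest theta with P(U_1 <= k | u2) > alpha, by log-scale bisection.

    The answer is only meaningful if it lies inside `bracket`.
    """
    low, high = bracket
    for _ in range(iters):
        mid = math.sqrt(low * high)
        if cdf_u1(k, u2, mid) > alpha:
            low = mid            # limit lies above mid
        else:
            high = mid
    return low

# Hypothetical observation: X_1 = 3, X_2 = 4, so k = 3 and u2 = X_2 - X_1 = 1.
print(upper_limit(3, 1, 0.05))
```

A lower limit could be obtained the same way from the upper tail sum, as in (9).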

4. Applications. Despite the fact that the problem of hypothesis testing or
confidence intervals for the parameter
$\theta = \lambda_1 \lambda_2 \cdots \lambda_{k_1} / (\mu_1 \mu_2 \cdots \mu_{k_2})$,
where $\lambda_1, \lambda_2, \ldots, \lambda_{k_1}, \mu_1, \mu_2, \ldots,
\mu_{k_2}$ are Poisson parameters, may arise as a problem of interest in its
own right, the procedures described in this paper may be of more interest and
will presumably be applied more often as approximate techniques for
statistical inference questions concerning products and quotients of binomial
parameters. We proceed to give some illustrations of this usage. Throughout
the subsequent discussion we shall assume that the relevant parameters of all
binomial distributions being considered are such that the Poisson
approximation to the binomial distribution is satisfactory to the user.

Consequently, assume that we have $k_1 + k_2$ binomial populations with
parameters $(n_1, p_1), (n_2, p_2), \ldots, (n_{k_1+k_2}, p_{k_1+k_2})$
respectively and that the mutually independent binomial random variables
$X_1, X_2, \ldots, X_{k_1}, X_{k_1+1}, X_{k_1+2}, \ldots, X_{k_1+k_2}$ have
been observed. Then, let

(25)  $\rho = \dfrac{p_1 p_2 \cdots p_{k_1}}{p_{k_1+1} p_{k_1+2} \cdots p_{k_1+k_2}}$,  $p_i > 0$, $i = 1, 2, \ldots, k_1 + k_2$.

Replace $n_i p_i$ by $\lambda_i$, $i = 1, 2, \ldots, k_1$, and for
$i = k_1+1, k_1+2, \ldots, k_1+k_2$ replace $n_i p_i$ by $\mu_j$, where
$j = i - k_1$. Then, assuming that $X_1, X_2, \ldots, X_{k_1+k_2}$ are each
approximately Poisson distributed, we have from (10),

(26)  $P\Bigl\{\theta_1 < \theta = \dfrac{\lambda_1 \lambda_2 \cdots \lambda_{k_1}}{\mu_1 \mu_2 \cdots \mu_{k_2}} < \theta_2\Bigr\} \approx 1 - \alpha$.

This is equivalent to

(27)  $P\Bigl\{\theta_1 < \prod_{i=1}^{k_1} n_i p_i \Big/ \prod_{i=k_1+1}^{k_1+k_2} n_i p_i < \theta_2\Bigr\} \approx 1 - \alpha$,

and from (27), we obtain an approximate confidence interval for $\rho$ by

(28)  $P\Bigl\{\theta_1 \prod_{i=k_1+1}^{k_1+k_2} n_i \Big/ \prod_{i=1}^{k_1} n_i < \rho < \theta_2 \prod_{i=k_1+1}^{k_1+k_2} n_i \Big/ \prod_{i=1}^{k_1} n_i\Bigr\} \approx 1 - \alpha$.

The process for getting approximate upper (lower) confidence limits for
$\rho$ is quite similar to the derivation of (28) and will not be explicitly
stated here. In addition, in testing hypotheses, we clearly have that a test
of any hypotheses concerning $\theta$ is an approximate test for the
corresponding hypotheses for $\rho$.
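
The rescaling in (28) is simple arithmetic: limits for $\theta$ are multiplied by the product of the sample sizes of the denominator populations and divided by the product of the sample sizes of the numerator populations. A small sketch with hypothetical names and made-up numbers:

```python
import math

def rho_interval(theta_lo, theta_hi, n_numerator, n_denominator):
    """Convert confidence limits for theta into limits for rho, as in (28).

    n_numerator: sample sizes n_1, ..., n_{k1};
    n_denominator: sample sizes n_{k1+1}, ..., n_{k1+k2} (may be empty).
    """
    scale = math.prod(n_denominator) / math.prod(n_numerator)
    return theta_lo * scale, theta_hi * scale

# Made-up example: theta limits (2, 8); numerator trials 100 and 200;
# one denominator population with 50 trials.
print(rho_interval(2.0, 8.0, [100, 200], [50]))
```

With no denominator populations ($k_2 = 0$) the empty product equals 1, so the limits are simply divided by the product of the numerator sample sizes.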

We  now  turn  to  some  concrete  illustrations. 

In reliability analysis, a mechanism may fail if and only if each of $k$
components fail. Let $E_i$ be the event that the $i$th component fails,
$i = 1, 2, \ldots, k$, and assume that the events $E_i$ are mutually
independent. Then, the probability of failure
$= P\bigl(\bigcap_{i=1}^{k} E_i\bigr) = \prod_{i=1}^{k} P(E_i) = \prod_{i=1}^{k} p_i$.
If each component is tested separately in $n_i$ Bernoulli trials, and if the
$p_i$'s are "small" and the $n_i$'s are "large", then (28) or the equivalent
formula for upper (lower) confidence limits for $\rho$ applies. For this
problem R. J. Buehler [1] gave a procedure employing a Poisson approximation.
However, Buehler's procedure does not readily extend to products of more than
two binomial parameters without introducing extensive computational
difficulties. On the other hand, for $k > 2$, the series (4) introduced in
this paper, whose individual terms give the conditional distribution (when
normalized by (4)), converges more rapidly than the exponential series and
can be easily evaluated in any specific case by hand computation. The
individual terms can each be computed recursively. A. Madansky [2] employed
the likelihood ratio statistic $L(\rho)$ and used the approximate
distribution theory, namely that $-2 \log L(\rho)$ has asymptotically the
$\chi^2$ distribution with one degree of freedom.

He compared this with the approximate confidence regions that would be
obtained by "linearization" methods. Madansky also noted that the application
of the asymptotic distribution theory for either the likelihood ratio
statistic or the "linearized" statistic is not too satisfactory for the case
of very high reliabilities. However, this last concern is precisely what
motivated the present investigation.

To see how one may obtain ratios ($k_2 > 0$), we state the specific problem
which was posed to the author. Let $E_1, E_2, E_3, E_4$ be arbitrary events.
A confidence interval for $P(E_2 \cap E_3 \cap E_4 \mid E_1)$ is required,
which we write as

(29)  $P(E_2 \cap E_3 \cap E_4 \mid E_1) = P(E_1 \cap E_2)\, P(E_3 \mid E_1 \cap E_2)\, P(E_4 \mid E_1 \cap E_2 \cap E_3) / P(E_1)$.

Separate sequences of Bernoulli trials are conducted for each of the four
factors in (29). Thus, we seek to obtain a confidence interval for a
parameter of the form $\rho = p_2 p_3 p_4 / p_1$, and (28) applies. In this
illustration, we have $k_2 = 1$; clearly,
the  above  illustration  can  be  extended  to  exhibit  experiments  with  other  values  for 

Experiments such as the type leading to (29) are useful in situations
requiring very high reliability, inasmuch as the conditioning appearing in
terms like $P(E_4 \mid E_1 \cap E_2 \cap E_3)$ may be needed in order that the
probability of occurrence of a failure will be sufficiently high so that a
failure may be observable in a moderate number of trials. In addition, this
type of experiment may also be used to eliminate the need for assuming
independence in reliability problems. However, it does introduce the
difficulty of requiring conditional experiments.
METHODOLOGY OF ASSESSMENT OF BIOCELLULAR PERFORMANCE


George I. Levin

ABSTRACT. Our laboratory is interested in problems which are
concerned with the assessment of the effect of absorbed energy on the
efficiency of performance of bio-cellular systems as modified by the
absorption of external energy. The types of specific, non-destructive
analytical procedures which are designed for this purpose and which
have been the subject of previous presentations to The Design of
Experiments in Army Research, Development and Testing are listed below.


Microscopy - A spectrum line (Mercury 2537 Å) is used as the light
source for better optical resolution.

Spectroscopy - A continuous light source (hydrogen discharge tube)
together with a spectrograph of low dispersion. The combination allows
the detection and identification of large molecules in a mixture.

Microspectroscopy - Both sources are used. The line source for
microscopical structure. The continuous source brings out absorption
band details which are needed for compound (amino acids, etc.)
differentiation and identification.

Model Simulation - A three dimensional model is described which
simulates the action of an animal which senses the presence of an object
and then reaches for it. The many unrealities of task performance of
this model are pointed out. These include the lack of biochemical
reality, which means no biochemical feedback with no replacement of
material as action performance continues.

Biochemistry of Tissue Systems - The relationship of specific task
performance to the chemical composition of the particular tissue system.
Subjects considered: Proteins, Nucleic Acids, Lipoids, Carbohydrates,
Polysaccharides, Enzymes, etc.

Bionics and Cybernetics - A consideration of the application of systems
analysis in relation to animal performance. Feedback effects.

Mechanism of Energy Absorption by Cellular Systems - An analogy is
drawn between the origin of optical spectra and the amount of energy
absorbed by the system on exposure to ultraviolet, visible or infrared
radiation.

Consequences of Energy Absorbed by Biocellular Systems - Initiation
of atom and free radical chain reactions which result in the formation
of wound tracts and stress. Levels of damage.

The last presentation was a summary of the above.
MONTE CARLO INVESTIGATION OF THE ROBUSTNESS OF DIXON'S CRITERIA
FOR TESTING OUTLYING OBSERVATIONS

Aberdeen Proving Ground, Maryland


ABSTRACT. An investigation of the effect of non-normality on the
distribution of Dixon's criteria for detecting outlying observations is
presented here. Monte Carlo techniques were used to determine the
distributions of the Dixon statistics when observations are selected
from specific non-normal distributions with varying degrees of abnormality.
Two such distributions whose degree of abnormality, as determined by the
coefficient of skewness, may be varied by changes in the parameters of
the distributions are the beta and gamma distributions.

A measure of the lack of robustness, that is, the sensitivity to
departures from normality, in the Dixon criteria may be determined by
comparison of the frequency distributions of the Dixon type statistics
computed from sampling the non-normal distributions with those values
obtained by Dixon when sampling from the normal distribution.

Based on the distributions of the Dixon statistics computed from the
non-normal distributions, it has been shown that Dixon's criteria are not
robust and their wide use may result in incorrect decisions when the
underlying distribution is asymmetric or skewed.

I. INTRODUCTION. After experimental data has been collected, and
before it can be analyzed, the observations must be carefully screened
to determine if they come from the same population. If any of these
observations appear to be radically different from the majority of the
other values obtained in the experimentation, it is necessary to
determine if the suspect value is an extreme value or an outlying observation
(commonly called an outlier). By an outlier, we mean an observation that
did not come from the same population as the remaining values. In order
to do this, a knowledge of the testing procedures, the manner in which
the data was collected and recorded, and some prior knowledge as to what
the range of the observations should be, are very helpful in deciding
whether a value should be retained in the analyses or be thrown out as
an outlier.

To be consistent in this process, statistical procedures have been
developed to determine whether a value is an outlier or not. One of
these procedures was developed by W. J. Dixon (1). Dixon's statistics
have the advantage of being easily computed and are thus widely used in
applied statistics. However, Dixon's statistics were developed for
normally distributed variates. The question was posed as to whether or
not Dixon's tests for outliers are robust, that is,
are the tests insensitive to deviations from normality. In order to
check the robustness of Dixon's test statistics, the coefficient of
skewness was chosen to measure the degree of departure from normality.
Two distributions whose coefficients of skewness may be varied by
changes in the parameters of the distributions are the beta and gamma
distributions. Thus, these two distributions were chosen to be used
in this paper.


II. TEST OF ROBUSTNESS OF DIXON'S CRITERIA.


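2.1 Definitions of Statistics to be Investigated. The four statistics
proposed by Dixon for testing extreme values are defined below, where
the X's are the observed values from a normal distribution arranged in
ascending order such that $X_1 \le X_2 \le X_3 \le \cdots \le X_{n-1} \le X_n$.

For a single outlier $X_1$,

$$r_{10} = \frac{X_2 - X_1}{X_n - X_1} \qquad [1a]$$

or for a single outlier $X_n$,

$$r_{10} = \frac{X_n - X_{n-1}}{X_n - X_1} \qquad [1b]$$

For a single outlier $X_1$, avoiding $X_n$,

$$r_{11} = \frac{X_2 - X_1}{X_{n-1} - X_1} \qquad [2a]$$

or for a single outlier $X_n$, avoiding $X_1$,

$$r_{11} = \frac{X_n - X_{n-1}}{X_n - X_2} \qquad [2b]$$

For outlier $X_1$, avoiding $X_2$ and $X_n$,

$$r_{21} = \frac{X_3 - X_1}{X_{n-1} - X_1} \qquad [3a]$$

or for outlier $X_n$, avoiding $X_1$ and $X_{n-1}$,

$$r_{21} = \frac{X_n - X_{n-2}}{X_n - X_2} \qquad [3b]$$

For outlier $X_1$, avoiding $X_2$, $X_{n-1}$, and $X_n$,

$$r_{22} = \frac{X_3 - X_1}{X_{n-2} - X_1} \qquad [4a]$$

or for outlier $X_n$, avoiding $X_1$, $X_2$, and $X_{n-1}$,

$$r_{22} = \frac{X_n - X_{n-2}}{X_n - X_3} \qquad [4b]$$
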

These computations are widely used in applied statistics. One of
the main advantages in using these statistics is the ease with which
these tests for outliers may be performed. It is a simple matter, especially
for small samples, to visually order the data such that the values needed
for the test statistic, i.e., $X_1$, $X_2$, $X_{n-1}$, $X_n$, can be
determined. Then using these values, $r_{j,i-1}$ is computed and compared
to the critical value listed in tables that are readily available. If
$r_{j,i-1}$ (the computed value) is greater than R (the critical value),
at the desired risk level, $\alpha$, then $X_k$ (k = 1 or n) is determined
to be an outlier with $1 - \alpha$ confidence.
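
The definitions [1] through [4] and the decision rule above can be sketched in a few lines of Python. This is only an illustration, not the program used for the paper: the function names `dixon_ratio` and `is_outlier` are hypothetical, and the lower-tail forms are obtained here by mirroring the sample, which is an implementation choice rather than something stated in the text.

```python
def dixon_ratio(sample, j, i, tail="upper"):
    """Return Dixon's r_{j,i-1} for the largest ("upper") or smallest
    ("lower") observation.  (j, i) = (1, 1) gives r10, (1, 2) gives r11,
    (2, 2) gives r21 and (2, 3) gives r22."""
    x = sorted(sample)
    n = len(x)
    if n < i + j + 1:
        raise ValueError("sample too small for this statistic")
    if tail == "lower":
        # Mirror the sample so the smallest value plays the role of X_n;
        # this turns forms [1a]-[4a] into forms [1b]-[4b].
        x = [-v for v in reversed(x)]
    # X_n is x[n-1], X_{n-j} is x[n-1-j], X_i is x[i-1] (0-based indexing).
    return (x[n - 1] - x[n - 1 - j]) / (x[n - 1] - x[i - 1])


def is_outlier(ratio, critical_value):
    """Decision rule: flag the suspect value when r exceeds R."""
    return ratio > critical_value


if __name__ == "__main__":
    # Ten observations from the paper's second example (beta, gamma_1 = -0.64).
    obs = [0.2306, 0.3312, 0.4317, 0.4814, 0.5489,
           0.5806, 0.6548, 0.6637, 0.7363, 0.9701]
    r11 = dixon_ratio(obs, j=1, i=2, tail="upper")
    print(round(r11, 4), is_outlier(r11, 0.477), is_outlier(r11, 0.3466))
```

The two critical values in the usage lines are the .05 entries of Table VI (Dixon's normal-theory value and the beta value under $\gamma_1$ = -0.64), so the two calls to `is_outlier` reproduce the conflicting decisions discussed in section 2.4.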

Since Dixon's critical values were derived using the normal distribution,
the question was posed as to how departure from normality would affect these
critical values. In order to investigate this, the Pearson Type I curve (2)
or beta distribution was chosen as the underlying distribution. This
distribution was used with various $\alpha$'s and $\beta$'s to give
distributions with various degrees of skewness.
Dixon computed the distribution of the ratio
$r_{j,i-1} = (X_n - X_{n-j}) / (X_n - X_i)$ using the following function:

$$\int_{-\infty}^{\infty}\int_{0}^{\infty} \frac{n!}{(i-1)!\,(n-j-i-1)!\,(j-1)!}
\left[\int_{-\infty}^{x-v} f(t)\,dt\right]^{i-1}
\left[\int_{x-v}^{x-rv} f(t)\,dt\right]^{n-j-i-1}
\left[\int_{x-rv}^{x} f(t)\,dt\right]^{j-1}
f(x-v)\,f(x-rv)\,f(x)\,v\,dv\,dx \qquad [5]$$

where j = 1, 2; i = 1, 2, 3; $v = X_n - X_i$; $rv = X_n - X_{n-j}$; $x = X_n$; and

$$f(t) = \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}.$$


If, instead of the normal distribution, the beta distribution is used
in [5], the following function is obtained:

$$\int_{0}^{1}\int_{0}^{x} \frac{n!}{(i-1)!\,(n-j-i-1)!\,(j-1)!}
\left[\int_{0}^{x-v} f(t)\,dt\right]^{i-1}
\left[\int_{x-v}^{x-rv} f(t)\,dt\right]^{n-j-i-1}
\left[\int_{x-rv}^{x} f(t)\,dt\right]^{j-1}
f(x-v)\,f(x-rv)\,f(x)\,v\,dv\,dx \qquad [6]$$

where j = 1, 2; i = 1, 2, 3; $v = X_n - X_i$; $rv = X_n - X_{n-j}$; $x = X_n$; and

$$f(t) = \frac{(\alpha+\beta+1)!}{\alpha!\,\beta!}\,t^{\alpha}(1-t)^{\beta}.$$

It is apparent that this integration is very difficult for
sample sizes of n = 3 and becomes more difficult as n increases. In
fact, Dixon used numerical integration for only a few sample sizes and
interpolated to obtain the remaining values. Thus, due to the problems
of integration and the fact that Mowchan (3) has demonstrated that
Monte Carlo techniques for obtaining the distributions of the
$r_{j,i-1}$'s are very accurate, it was decided that Monte Carlo
techniques would be used in this paper.
2.2 Monte Carlo Techniques. In order to use Monte Carlo techniques, it
was necessary to draw random samples from the beta distribution. Since
beta random numbers are not usually readily available in the form of
subroutines, the following method was used.

The Ballistic Research Laboratories Electronic Scientific Computer
(BRLESC) at Aberdeen Proving Ground, Maryland was used to generate a
random number, y, from the uniform distribution over the unit interval.
This uniform random number, y, was considered to be the area of interest
from a cumulative distribution, F(X). The cumulative form of the
distribution was integrated from 0 to X, where X is the point on the
distribution that would define an area equal to y. For the beta distribution
this is as follows:


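$$F(X) = \begin{cases}
0, & X < 0 \\
\displaystyle\int_0^X \frac{(\alpha+\beta+1)!}{\alpha!\,\beta!}\,t^{\alpha}(1-t)^{\beta}\,dt, & 0 \le X \le 1 \\
1, & X > 1
\end{cases}$$

thus

$$y = \int_0^X \frac{(\alpha+\beta+1)!}{\alpha!\,\beta!}\,t^{\alpha}(1-t)^{\beta}\,dt.$$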


This procedure for generating X's was used to obtain samples of
size n = 6, 10, and 15 for this paper.
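
The inverse-cumulative procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the BRLESC program: the function names are hypothetical, the closed-form binomial sum for F(X) holds only for the integer $\alpha$ and $\beta$ used in this paper, and bisection is one convenient way (not necessarily the author's) of solving F(X) = y.

```python
import math
import random


def beta_cdf(x, a, b):
    """F(x) for the density (a+b+1)!/(a! b!) t^a (1-t)^b with integer a, b >= 0.
    For integer parameters the integral equals a binomial tail sum."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    m = a + b + 1
    return sum(math.comb(m, k) * x**k * (1.0 - x)**(m - k)
               for k in range(a + 1, m + 1))


def beta_inverse_cdf(y, a, b, tol=1e-10):
    """Solve F(X) = y for X by bisection on the unit interval."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if beta_cdf(mid, a, b) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def draw_beta_sample(n, a, b, rng):
    """Draw n uniform numbers y and convert each to X with F(X) = y;
    the sample is returned in ascending order."""
    return sorted(beta_inverse_cdf(rng.random(), a, b) for _ in range(n))


if __name__ == "__main__":
    rng = random.Random(1968)   # arbitrary seed, chosen only for repeatability
    print(draw_beta_sample(6, 8, 3, rng))
```

Because the sample comes back already ordered, it can be passed directly to the ratio formulas [1] through [4].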


2.3  Determination  of  Critical  Values.  An  extreme  value  may  occur  as 
either  a  high  value  or  a  low  value.  Thus,  since  the  beta  distribution 
is  generally  not  symmetric,  it  was  necessary  to  construct  test  criteria 
for  testing  either  high  or  low  values.  To  do  this,  both  forms  of 
equations  [1]  through  [4]  were  used. 


Six hundred samples of size n were drawn. Each sample of size n was
ordered such that $X_1 \le X_2 \le \cdots \le X_n$. Then using the
appropriate X's the test statistics were computed using each of the
formulas to obtain the $r_{j,i-1}$'s. After 600 $r_{j,i-1}$'s were
obtained for each test statistic, the cumulative distribution of these
$r_{j,i-1}$'s was constructed. Various percentiles were computed ranging
from 10 to 99.5. These percentiles, along with Dixon's percentiles from
the normal distribution (4), are given in tables I through XII. These
percentiles are given in terms of $\alpha$, where $\alpha$ is equal to
one minus the various percentiles. Thus $\alpha$ is equal to the
significance level of the test at which the suspect outlier is being
tested, with the values in the tables being the critical values at the
given significance level. These critical values are tabulated for
$r_{10}$, $r_{11}$, $r_{21}$ and $r_{22}$ for both upper and lower tails
using the following parameters of the beta distribution with their
skewness coefficient, $\gamma_1$, as identification.

     α      β      γ1

     5      5      0
     7      4     -0.24
     8      3     -0.42
     9      2     -0.64
    10      1     -0.96
    19      1     -1.14
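
The Monte Carlo procedure just described (600 samples, one statistic per sample, empirical percentiles read off as critical values) can be sketched as below. Several choices are assumptions made only for illustration: the standard-library generator `random.betavariate` stands in for the paper's inverse-cumulative generator, only $r_{10}$ is computed, the percentile rule is a simple order-statistic lookup, and all function names are hypothetical.

```python
import random


def r10(sample, tail):
    """Dixon's r10 for the largest ("upper") or smallest ("lower") value."""
    x = sorted(sample)
    gap = x[-1] - x[-2] if tail == "upper" else x[1] - x[0]
    return gap / (x[-1] - x[0])


def empirical_critical_value(stats, alpha):
    """Value exceeded by roughly a fraction alpha of the simulated statistics."""
    ordered = sorted(stats)
    k = min(len(ordered) - 1, int((1.0 - alpha) * len(ordered)))
    return ordered[k]


def simulate_critical_values(n, a, b, alphas, tail, reps=600, seed=0):
    """Simulate `reps` samples of size n from the density proportional to
    t^a (1-t)^b and return {alpha: critical value} for r10."""
    rng = random.Random(seed)
    stats = [r10([rng.betavariate(a + 1, b + 1) for _ in range(n)], tail)
             for _ in range(reps)]
    return {alpha: empirical_critical_value(stats, alpha) for alpha in alphas}


if __name__ == "__main__":
    # Beta parameters (8, 3), i.e. gamma_1 = -0.42, with n = 6.
    print(simulate_critical_values(6, 8, 3, [0.10, 0.05, 0.01], "lower"))
```

With only 600 replications the extreme percentiles rest on a handful of simulated values, so a rerun with a different seed should be expected to move the .01 and .005 entries noticeably.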


These skewness coefficients were computed using
$\gamma_1 = E(X-\mu)^3 / \sigma^3$, where
$E(X-\mu)^3 = \mu'_3 - 3\mu'_1\mu'_2 + 2(\mu'_1)^3$. For the beta
distribution

$$E(X-\mu)^3 = \frac{(\alpha+1)(\alpha+2)(\alpha+3)}{(\alpha+\beta+2)(\alpha+\beta+3)(\alpha+\beta+4)}
- \frac{3(\alpha+1)^2(\alpha+2)}{(\alpha+\beta+2)^2(\alpha+\beta+3)}
+ \frac{2(\alpha+1)^3}{(\alpha+\beta+2)^3}$$

and

$$\sigma^3 = \left[\frac{(\alpha+1)(\alpha+2)}{(\alpha+\beta+2)(\alpha+\beta+3)}
- \frac{(\alpha+1)^2}{(\alpha+\beta+2)^2}\right]^{3/2}.$$


These values of $\alpha$ and $\beta$ were chosen so as to give various
degrees of skewness. When $\alpha = \beta$ the beta distribution is
symmetric and the skewness coefficient is equal to zero. In order to
minimize computer time, it was desired to keep the sum of $\alpha$ and
$\beta$ as small as possible, since as this sum increases, so does the
computing time. However, it was desired to get various degrees of
skewness; thus $\alpha$ was increased and $\beta$ decreased.

By choosing to do this, negative skewness coefficients were obtained.
Positive skewness coefficients could have been obtained by increasing
$\beta$ and decreasing $\alpha$. However, the only difference a positive
skewness coefficient would make is that the skewed tail would be on the
right instead of on the left. Thus, if the skewness coefficients were
positive the upper and lower tail values would be reversed.
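
The statement that reversing the sign of the skewness merely exchanges the upper and lower tail values can be checked numerically: reflecting a sample about 1/2 turns a left-skewed beta sample into a right-skewed one, and the upper-tail $r_{10}$ of the original equals the lower-tail $r_{10}$ of the reflection. The sketch below is an illustration only; the helper names are hypothetical and `random.betavariate` is used as a stand-in generator.

```python
import random


def r10_upper(sample):
    """r10 for the largest observation, form [1b]."""
    x = sorted(sample)
    return (x[-1] - x[-2]) / (x[-1] - x[0])


def r10_lower(sample):
    """r10 for the smallest observation, form [1a]."""
    x = sorted(sample)
    return (x[1] - x[0]) / (x[-1] - x[0])


if __name__ == "__main__":
    rng = random.Random(0)
    # Density proportional to t^8 (1-t)^3: negative skewness.
    sample = [rng.betavariate(9, 4) for _ in range(6)]
    # The reflection 1 - X has density proportional to t^3 (1-t)^8:
    # same magnitude of skewness, opposite sign.
    mirrored = [1.0 - v for v in sample]
    print(r10_upper(sample), r10_lower(mirrored))
    print(r10_lower(sample), r10_upper(mirrored))
```

Each printed pair agrees up to rounding, which is why tables for negative skewness can serve for positive skewness with the tails interchanged.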

Since the beta variates range in value only from 0 to 1, the question
arises as to how sampling from a distribution which has an infinite limit
on one tail would affect the critical values. Thus, the Pearson Type III
Curve or gamma distribution, which has as its limits 0 to $\infty$, was chosen.

The cumulative distribution for the gamma distribution is

$$F(X) = \int_0^X \frac{t^{\alpha}\,e^{-t/\beta}}{\Gamma(\alpha+1)\,\beta^{\alpha+1}}\,dt, \qquad X > 0$$

with $\alpha > -1$ and $\beta > 0$. Since a change in $\beta$ only changes
the scale and not the general shape of the curve, without loss of
generality $\beta = 1$ was used with $\alpha$ = 0, 1, 2, 3, 4, 5.
$\gamma_1$ was computed for the gamma distribution using
$\gamma_1 = E(X-\mu)^3 / \sigma^3$, where

$$E(X-\mu)^3 = (\alpha+1)(\alpha+2)(\alpha+3) - 3(\alpha+1)^2(\alpha+2) + 2(\alpha+1)^3$$

and

$$\sigma^3 = (\alpha+1)^{3/2}.$$

     β      α      γ1

     1      0     2.00
     1      1     1.41
     1      2     1.15
     1      3     1.00
     1      4     0.89
     1      5     0.82
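
As a check on the tabulated skewness coefficients, the moment formulas given for the beta and gamma distributions can be evaluated directly. The sketch below is illustrative only; the function names are hypothetical, and the raw moments are written in the paper's parameterization (density proportional to $t^{\alpha}(1-t)^{\beta}$ for the beta, and to $t^{\alpha}e^{-t}$ for the gamma with $\beta$ = 1).

```python
def skewness_from_raw_moments(m1, m2, m3):
    """gamma_1 = E(X - mu)^3 / sigma^3 from the first three raw moments."""
    third_central = m3 - 3.0 * m1 * m2 + 2.0 * m1**3
    variance = m2 - m1**2
    return third_central / variance**1.5


def beta_skewness(a, b):
    """Skewness of the density proportional to t^a (1-t)^b on (0, 1)."""
    s = a + b
    m1 = (a + 1) / (s + 2)
    m2 = m1 * (a + 2) / (s + 3)
    m3 = m2 * (a + 3) / (s + 4)
    return skewness_from_raw_moments(m1, m2, m3)


def gamma_skewness(a):
    """Skewness of the density proportional to t^a e^(-t) on (0, infinity)."""
    m1 = a + 1
    m2 = m1 * (a + 2)
    m3 = m2 * (a + 3)
    return skewness_from_raw_moments(m1, m2, m3)


if __name__ == "__main__":
    for a, b in [(5, 5), (7, 4), (8, 3), (9, 2), (10, 1), (19, 1)]:
        print("beta ", a, b, round(beta_skewness(a, b), 2))
    for a in range(6):
        print("gamma", a, round(gamma_skewness(a), 2))
```

For the gamma case the expression simplifies to $\gamma_1 = 2/\sqrt{\alpha+1}$, which gives 0.89 for $\alpha$ = 4, in agreement with the value listed alongside the variances later in this section.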


The same general techniques described previously were used. Again
600 samples of sizes n = 6, 10, and 15 were drawn from the gamma
distributions. Both forms of equations [1] through [4] were used in
computing the test statistics since the gamma distribution is also
usually not symmetric but skewed.

The cumulative distributions of these test statistics were formed
and the critical values were computed for the various levels of
confidence. These critical values, along with those from the normal
distribution for the same levels of confidence, are given in tables
XIII through XXIV.

From the Dixon statistics that were computed using the beta and
gamma distributions, it is apparent that for a given confidence level
the critical values in the skewed tail (the lower tail for the beta
distributions and the upper tail for the gamma distributions) increase
as the absolute value of $\gamma_1$ increases. Conversely, in the tail
opposite the skewness, the critical values tend to decrease as the
absolute value of $\gamma_1$ increases.

The reason for this might be described in the following manner:

Let us look at the Dixon test which uses the statistic $r_{10}$, where

$$r_{10} = \frac{X_n - X_{n-1}}{X_n - X_1}$$

for the test of an observation that appears to be larger than the other
observations in the sample. It is obvious that for $r_{10}$ to become
smaller, the numerator (the difference between the largest and the next
largest observation) must become smaller faster than the denominator
(the difference between the largest and the smallest observation). It
should also be noted that for the beta and gamma variates used in this
paper the absolute value of $\gamma_1$ increases as the variance decreases.
     Beta Distribution          Gamma Distribution

       γ1        σ²               γ1        σ²

       0        .019             0.82      6.0
     -0.24      .017             0.89      5.0
     -0.42      .015             1.00      4.0
     -0.64      .013             1.15      3.0
     -0.96      .009             1.41      2.0
     -1.14      .004             2.00      1.0

This can be intuitively demonstrated when considering the fact that
as the skewness increases, the distribution becomes clustered at
one end of the range of the distribution, with only a small portion
of the distribution lying in the skewed tail. For example, using
the beta distribution with a large skewness coefficient, let the
suspect outlier be a value larger than the other sample observations.
The majority of the values are generally clustered in the upper
tail, close to the upper limit of one. It would, therefore, be very
unlikely for the difference between the largest and the second largest
observation to be very large. On the other hand, since the skewed tail
of the distribution goes to 0, it is likely that in a sample, at least
one of the observations will be small in comparison with the other
observations. Therefore, when a distribution is markedly skewed, it is
expected that the values of $r_{10}$ will be small for the skewed tail.

The critical values obtained using the beta and gamma distributions
were compared to Dixon's critical values by using the Kolmogorov-Smirnov
statistic (5). The empirical distributions were tested against those
derived by Dixon and the level at which these tests of equality were
rejected is given in tables XXV and XXVI. The distributions of the
$r_{j,i-1}$ were listed as not significantly different from those obtained
by Dixon for the normal distribution at the .10 level.
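
The paper does not say how the Kolmogorov-Smirnov comparison was carried out, so the following is only one plausible sketch. It assumes that the simulated statistics are compared with Dixon's distribution at the tabulated percentage points only (a lower bound on the full Kolmogorov-Smirnov distance), and it uses the usual large-sample .10 critical value of about 1.22 divided by the square root of n. The names `ks_distance_at_points` and `DIXON_R10_N6` are hypothetical; the reference pairs are the D.C. column of Table I.

```python
import bisect
import math

# (critical value R, alpha) pairs for r10, n = 6, from the D.C. column of
# Table I; Dixon's cumulative distribution at R is 1 - alpha.
DIXON_R10_N6 = [(.038, .900), (.079, .800), (.121, .700), (.164, .600),
                (.210, .500), (.261, .400), (.318, .300), (.386, .200),
                (.482, .100), (.560, .050), (.644, .020), (.698, .010),
                (.740, .005)]


def ks_distance_at_points(simulated, reference_points):
    """Largest |F_n(R) - F_0(R)| over the tabulated points, where F_n is the
    empirical distribution of the simulated statistics."""
    ordered = sorted(simulated)
    n = len(ordered)
    worst = 0.0
    for crit, alpha in reference_points:
        empirical = bisect.bisect_right(ordered, crit) / n
        worst = max(worst, abs(empirical - (1.0 - alpha)))
    return worst


def exceeds_ks_10_percent(distance, n):
    """Compare with the approximate large-sample .10 critical value."""
    return distance > 1.22 / math.sqrt(n)


if __name__ == "__main__":
    toy = [0.05, 0.10, 0.20, 0.30, 0.45, 0.60]   # toy values, not simulation output
    d = ks_distance_at_points(toy, DIXON_R10_N6)
    print(round(d, 3), exceeds_ks_10_percent(d, len(toy)))
```

In practice `simulated` would hold the 600 Monte Carlo values of a statistic, for which the approximate .10 threshold is about 0.05.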

It can be seen for the beta distribution variates that the
significance level generally decreases as the absolute value of the
skewness coefficient increases. For gamma distribution variates, the
significance level is generally .01 for all degrees of skewness. Since
the distributions of the $r_{j,i-1}$'s obtained using the beta and gamma
distributions are significantly different from those obtained by Dixon
using the normal distribution, some examples are given to show how it
is possible to make the wrong decision in deciding whether or not an
observation is an outlier.

2.4 Examples. Suppose, for example, we had the following observations
from a beta distribution with a skewness coefficient of $\gamma_1$ = -0.42:

     X1 = .2319      X2 = .6516      X3 = .7453
     X4 = .7555      X5 = .8547      X6 = .9690

Let $X_1$ be our suspect outlier, and use

$$r_{10} = \frac{X_2 - X_1}{X_6 - X_1} = \frac{.6516 - .2319}{.9690 - .2319} = 0.569.$$

Comparing this with Dixon's critical value of 0.560, we would designate
$X_1$ as an outlier at the .05 level of significance. However, using the
critical values in table I, Lower Tail, under $\gamma_1$ = -0.42, we see
that the critical value is 0.6085 at the .05 significance level. Thus,
$X_1$ would not be an outlier.


As the second example, take the 10 observations drawn from a beta
distribution with $\gamma_1$ = -0.64:

     X1 = 0.2306     X2 = 0.3312     X3 = 0.4317
     X4 = 0.4814     X5 = 0.5489     X6 = 0.5806
     X7 = 0.6548     X8 = 0.6637     X9 = 0.7363
     X10 = 0.9701

Using

$$r_{11} = \frac{X_{10} - X_9}{X_{10} - X_2}$$

as our test statistic, we test $X_{10}$ to see if it is an outlier:

$$r_{11} = \frac{.9701 - .7363}{.9701 - .3312} = .3660.$$

Using Dixon's criteria, $X_{10}$ would not be an outlier at the .05
significance level. However, using the critical value listed in
table VI, Upper Tail, under $\gamma_1$ = -0.64, we see that its critical
value is 0.3466. Thus, $X_{10}$ would be an outlier.

For example three, let us look at a sample drawn from a gamma
distribution with $\gamma_1$ = 1.15.

     X1 = 0.4790     X2 = 0.9628     X3 = 1.4398
     X4 = 1.8540     X5 = 2.5660     X6 = 2.8963
     X7 = 3.4193     X8 = 3.6188     X9 = 6.6278
     X10 = 9.0973

Using

$$r_{21} = \frac{X_{10} - X_8}{X_{10} - X_2}$$

to test $X_{10}$ we get

$$r_{21} = \frac{9.0973 - 3.6188}{9.0973 - 0.9628} = 0.6735,$$

which is significant at the .05 significance level, using Dixon's critical
value of 0.612. However, using table XIX, Upper Tail, under $\gamma_1$ = 1.15,
we see that the critical value at this .05 significance level is 0.7514.
Thus, $X_{10}$ would not be an outlier.

As example four, take a sample of size n = 15 from a gamma
distribution with $\gamma_1$ = 1.15.

     X1 = 0.2129     X2 = 1.1867     X3 = 2.3271
     X4 = 2.7486     X5 = 2.8934     X6 = 3.0924
     X7 = 3.4674     X8 = 3.6631     X9 = 3.8998
     X10 = 4.1009    X11 = 4.3123    X12 = 4.5184
     X13 = 5.6396    X14 = 5.7802    X15 = [illegible]

Using $X_1$ as our suspect outlier, and

$$r_{22} = \frac{X_3 - X_1}{X_{13} - X_1},$$

we get

$$r_{22} = \frac{2.3271 - 0.2129}{5.6396 - 0.2129} = 0.3896.$$

Dixon's critical value at the .05 significance level is 0.525. Thus,
$X_1$ would not be an outlier at the .05 significance level. However,
using table XXIV, Lower Tail, under $\gamma_1$ = 1.15, we see that the
critical value is 0.353. Thus, $X_1$ would be an outlier at the .05
significance level.
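
The arithmetic in the four examples can be verified with a few lines of code. The sketch below simply recomputes each ratio from the observations quoted above and compares it with the two critical values cited in the text (for the second example, Dixon's value of 0.477 is taken from the D.C. column of Table VI); the tuple layout and variable names are illustrative only.

```python
# (label, numerator, denominator, Dixon critical value, non-normal critical value)
EXAMPLES = [
    ("Example 1: r10, lower tail, beta",  0.6516 - 0.2319, 0.9690 - 0.2319, 0.560, 0.6085),
    ("Example 2: r11, upper tail, beta",  0.9701 - 0.7363, 0.9701 - 0.3312, 0.477, 0.3466),
    ("Example 3: r21, upper tail, gamma", 9.0973 - 3.6188, 9.0973 - 0.9628, 0.612, 0.7514),
    ("Example 4: r22, lower tail, gamma", 2.3271 - 0.2129, 5.6396 - 0.2129, 0.525, 0.353),
]


def decisions(examples):
    """Return (label, ratio, outlier by Dixon, outlier by non-normal table)."""
    out = []
    for label, num, den, dixon_cv, skewed_cv in examples:
        ratio = num / den
        out.append((label, ratio, ratio > dixon_cv, ratio > skewed_cv))
    return out


if __name__ == "__main__":
    for label, ratio, by_dixon, by_skewed in decisions(EXAMPLES):
        print(f"{label}: r = {ratio:.4f}, Dixon: {by_dixon}, non-normal table: {by_skewed}")
```

In every case the two tables lead to opposite decisions: twice Dixon's value flags an observation that the non-normal table retains, and twice the reverse.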

From these examples, it is easy to see that there are two types
of errors that can be made if the sample observations are not from a
normal distribution and if Dixon's critical values are used for testing
extreme values. These values can be called outliers when, in fact,
they are not outliers at the chosen significance level, or they can be
outliers at a chosen significance level and not be so designated. Thus,
from these examples, it can be seen that the Type I or $\alpha$ errors,
i.e., the rejection of the hypothesis when it is in fact true, and the
Type II or $\beta$ errors, i.e., the acceptance of the hypothesis when it
is false, are not what they are specified to be when operating under the
assumption of normality when, in fact, the observations come from a
non-normal distribution.
III. CONCLUSIONS. It has been shown, on an empirical basis, and
using the Kolmogorov-Smirnov goodness-of-fit test, that there is a
difference in the cumulative distributions of the $r_{j,i-1}$ statistic
obtained using the normal distribution as opposed to distributions that
are non-normal. These differences are usually significant at a low risk
level.


Also, it has been shown that the effect of departure from normality
is dependent on whether the suspect outlier is a large or small value.
Thus, it is necessary to have critical values for testing either large
or small values.


It is also evident that the degree of skewness of the distributions
affects the critical values. That is, these critical values tend to
depart more from those values derived by Dixon for the normal
distribution as the skewness increases.

For a symmetric distribution, that is, one for which the skewness
coefficient is zero, Dixon's criteria are robust. However, as the
distribution becomes asymmetric and the absolute value of the skewness
coefficient increases, Dixon's criteria become less robust.
APPENDIX A

TABLES  OF  PERCENTAGE  POINTS  OF  DIXON'S  CRITERIA 
FROM  BETA  AND  GAMMA  DISTRIBUTIONS 
TABLE I

BETA DISTRIBUTION
N = 6
Pr(r10 > R) = α

UPPER TAIL

α/γ1    D.C.*   0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900    .038    .0435   .0378   .0244   .0243   .0215   .0199
.800    .079    .0809   .0668   .0530   .0560   .0453   .0430
.700    .121    .1214   .1077   .0888   .0895   .0713   .0694
.600    .164    .1629   .1427   .1280   .1226   .0972   .0903
.500    .210    .2089   .1827   .1650   .1596   .1250   .1214
.400    .261    .2479   .2147   .2083   .2011   .1600   .1507
.300    .318    .3109   .2733   .2572   .2486   .1923   .1972
.200    .386    .3703   .3399   .3271   .3112   .2429   .2686
.100    .482    .4797   .4280   .4260   .4180   .3201   .3663
.050    .560    .5756   .5043   .4984   .5045   .4014   .4386
.020    .644    .6453   .6014   .5810   .5903   .5334   .5806
.010    .698    .6865   .7085   .6515   .6039   .5896   .6342
.005    .740    .7289   .7444   .6716   .6124   .6304   .6601

LOWER TAIL

α/γ1    D.C.*   0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900    .038    .0353   .0376   .0537   .0474   .0530   .0508
.800    .079    .0753   .0837   .0997   .1074   .1137   .1130
.700    .121    .1160   .1339   .1412   .1614   .1773   .1700
.600    .164    .1559   .1817   .1810   .2046   .2394   .2363
.500    .210    .1924   .2311   .2274   .2715   .3072   .2948
.400    .261    .2517   .2787   .2899   .3317   .3750   .3975
.300    .318    .3124   .3276   .3418   .4004   .4369   .4364
.200    .386    .3667   .4079   .4253   .4729   .5135   .5203
.100    .482    .4637   .4975   .5363   .5555   .6045   .6258
.050    .560    .5416   .5800   .6085   .6296   .6760   .6863
.020    .644    .6264   .6696   .7150   .7011   .7472   .7565
.010    .698    .6889   .7125   .7491   .7252   .7794   .8308
.005    .740    .7161   .7571   .8002   .7600   .8249   .8571

* Dixon's critical values from normal distribution


TABLE II

BETA DISTRIBUTION
N = 6
Pr(r11 > R) = α

UPPER TAIL

α/γ1    D.C.    0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900    .056    .0584   .0510   .0366   .0403   .0312   .0327
.800    .113    .1074   .0987   .0661   .0951   .0715   .0695
.700    .169    .1647   .1559   .1274   .1391   .1105   .1109
.600    .227    .2192   .2007   .1786   .1815   .1??9   .1500
.500    .288    .2746   .2538   .2356   .2334   .1961   .1994
.400    .350    .3419   .3087   .3044   .3022   .2456   .2460
.300    .420    .4035   .3704   .3566   .3644   .3084   .3099
.200    .502    .4905   .4585   .4474   .4526   .3789   .3906
.100    .609    .6250   .5621   .5490   .5610   .4838   .5049
.050    .689    .6865   .6345   .6513   .6272   .5957   .5934
.020    .763    .7813   .7146   .7322   .7489   .6832   .7051
.010    .805    .8203   .8122   .7615   .7750   .7773   .7651
.005    .839    .8369   .8446   .8166   .6434   .8016   .8072

LOWER TAIL

α/γ1    D.C.    0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900    .056    .0460   .0557   .0724   .0660   .0681   .0660
.800    .113    .1022   .1144   .1332   .1387   .1468   .1411
.700    .169    .1602   .1765   .1820   .2106   .2161   .2162
.600    .227    .2135   .2397   .2340   .2657   .2897   .2851
.500    .288    .2661   .2954   .2910   .3339   .3767   .3684
.400    .350    .3265   .3516   .3610   .4186   .4453   .4443
.300    .420    .3929   .4192   .4332   .4873   .5155   .5095
.200    .502    .4603   .5150   .5319   .5655   .5900   .5911
.100    .609    .5946   .6201   .6398   .6722   .6952   .7030
.050    .689    .6859   .7093   .7371   .7336   .7719   .7638
.020    .763    .7686   .7809   .8101   .8011   .8256   .8649
.010    .805    .8506   .8452   .8311   .8738   .8515   .8858
.005    .839    .8775   .8538   .8674   .8837   .8762   .8960


TABLE III

BETA DISTRIBUTION
N = 6
Pr(r21 > R) = α

UPPER TAIL

α/γ1    D.C.    0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900    .268    .2530   .2521   .1961   .1903   .1866   .1851
.800    .364    .3614   .3321   .2969   .2917   .2584   .2522
.700    .439    .4396   .3876   .3703   .3668   .3238   .3145
.600    .504    .4966   .4634   .4430   .4356   .3848   .3840
.500    .563    .5486   .5199   .5064   .5012   .4401   .4571
.400    .621    .6162   .5842   .5698   .5623   .5203   .5176
.300    .680    .6797   .6412   .6450   .6133   .5819   .5828
.200    .745    .7537   .7083   .7049   .7051   .6475   .6654
.100    .821    .8364   .7992   .7836   .7985   .7547   .7534
.050    .872    .8823   .8605   .8416   .8643   .8138   .8079
.020    .924    .9169   .9147   .9121   .9031   .8809   .8852
.010    .951    .9393   .9303   .9428   .9296   .9045   .9248
.005    .970    .9553   .9476   .9509   .9607   .9291   .9623

LOWER TAIL

α/γ1    D.C.    0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900    .268    .2345   .2667   .2757   .2735   .2971   .3122
.800    .364    .3402   .3803   .3781   .3768   .4300   .4151
.700    .439    .4195   .4617   .4685   .4729   .5388   .5003
.600    .504    .4802   .5074   .5290   .5405   .6001   .5748
.500    .563    .5359   .5662   .5941   .6000   .6595   .6428
.400    .621    .5951   .6332   .6491   .6739   .7254   .6994
.300    .680    .6662   .6088   .7174   .7263   .7639   .7544
.200    .745    .7398   .7476   .7819   .7883   .8168   .8161
.100    .821    .8160   .8127   .8407   .6606   .8812   .8780
.050    .872    .8589   .8730   .8814   .9139   .9202   .9039
.020    .924    .9185   .9091   .9228   .9507   .9484   .9354
.010    .951    .9442   .9281   .9599   .9639   .9566   .9560
.005    .970    .9533   .9477   .9669   .9725   .9741   .9671


TABLE IV

BETA DISTRIBUTION
N = 6
Pr(r22 > R) = α

UPPER TAIL

α/γ1    D.C.    0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900    .410    .4132   .3692   .3419   .3234   .3042   .3144
.800    .540    .5269   .4974   .4003   .4703   .4300   .4177
.700    .640    .6116   .5838   .5869   .5654   .5550   .5145
.600    .720    .6912   .6613   .6578   .6406   .6316   .5943
.500    .780    .7548   .7273   .7213   .7002   .7059   .6815
.400    .830    .7981   .7816   .7806   .7682   .7799   .7467
.300    .880    .8577   .8445   .8?58   .8269   .8335   .8321
.200    .930    .9120   .899?   .90??   .8939   .8955   .8867
.100    .965    .9521   .9614   .9573   .9539   .9410   .9472
.050    .983    .9793   .9788   .9776   .9763   .9646   .9757
.020    .992    .9904   .9932   .9920   .9903   .9849   .9921
.010    .995    .9972   .9959   .9949   .9966   .9912   .9966
.005    .996    .9985   .9968   .9961   .9982   .9960   .9984

LOWER TAIL

α/γ1    D.C.    0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900    .410    .3726   .4277   .4209   .3903   .4205   .4233
.800    .540    .4999   .5369   .5539   .5300   .5921   .5606
.700    .640    .5979   .6209   .6390   .6321   .7011   .6735
.600    .720    .6793   .7110   .7135   .7088   .7608   .7423
.500    .780    .7617   .7588   .7731   .7876   .8263   .7979
.400    .830    .8093   .8147   .8324   .8385   .8759   .3523
.300    .880    .8563   .8721   .8861   .8806   .9082   .8990
.200    .930    .9026   .9160   .9273   .9243   .9345   .9366
.100    .965    .9563   .9567   .9630   .9664   .9678   .9690
.050    .983    .9786   .9796   .9820   .9878   .9838   .9872
.020    .992    .9903   .9901   .9909   .9941   .9926   .9962
.010    .995    .9960   .9956   .9938   .9969   .9958   .9937
.005    .998    .9990   .9968   .9972   .9990   .9967   .9992


TABLE V

BETA DISTRIBUTION
N = 10
Pr(r10 > R) = α

UPPER TAIL

α/γ1    D.C.    0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900    .025    .0267   .0210   .0173   .0157   .0113   .0123
.800    .051    .0509   .0384   .0379   .0325   .0223   .0249
.700    .080    .0784   .0596   .0552   .0523   .0346   .0348
.600    .110    .1054   .0873   .0838   .0696   .0510   .0462
.500    .142    .1354   .1158   .1107   .0894   .0680   .0579
.400    .178    .1690   .1506   .1324   .1115   .0368   .0774
.300    .219    .2132   .1832   .1563   .1384   .1092   .1015
.200    .273    .2611   .2252   .2079   .1734   .1372   .1269
.100    .349    .3340   .2841   .2550   .2326   .1881   .1762
.050    .412    .3791   .3471   .2969   .2801   .2283   .2115
.020    .483    .4545   .4109   .3813   .3385   .2882   .3081
.010    .527    .4923   .4359   .4154   .3773   .3249   .3437
.005    .568    .5219   .4626   .4232   .3968   .3566   .3624

LOWER TAIL

α/γ1    D.C.    0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900    .025    .0252   .0255   .0326   .0421   .0382   .0464
.800    .051    .0534   .0556   .0710   .0650   .0755   .0905
.700    .080    .0766   .0827   .1037   .1004   .1140   .1280
.600    .110    .0941   .1142   .1413   .1391   .1527   .1829
.500    .142    .1209   .1470   .1823   .1823   .1957   .2266
.400    .178    .1535   .1763   .2211   .2288   .2426   .2733
.300    .219    .1865   .2263   .2636   .2704   .3032   .3404
.200    .273    .2446   .2718   .3163   .3410   .3742   .4006
.100    .349    .3257   .3431   .3939   .4069   .4686   .4986
.050    .412    .3894   .4193   .4352   .4666   .5311   .5751
.020    .483    .4486   .4922   .5065   .5528   .5955   .6392
.010    .527    .4718   .5186   .5296   .5791   .6286   .6877
.005    .568    .4970   .5596   .5513   .6208   .6832   .6931


TABLE VI

BETA DISTRIBUTION
N = 10
Pr(r11 > R) = α

UPPER TAIL

α/γ1    D.C.    0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900    .030    .0322   .0273   .0222   .0234   .0168   .0151
.800    .063    .0597   .0457   .0479   .0432   .0296   .0333
.700    .098    .0940   .0753   .0747   .0672   .0472   .0496
.600    .134    .1261   .1091   .1063   .0896   .0693   .?659
.500    .173    .1621   .1400   .1343   .1158   .0884   .?342
.400    .216    .2008   .1810   .1638   .1412   .1157   .1059
.300    .265    .2528   .2269   .2050   .1774   .1430   .1353
.200    .325    .3085   .2758   .2592   .2217   .1608   .1736
.100    .409    .3800   .3573   .3328   .2739   .2389   .2256
.050    .477    .4544   .4241   .3903   .3466   .2822   .2873
.020    .551    .5257   .4845   .4279   .4217   .3671   .3753
.010    .597    .5641   .5097   .4491   .4585   .3852   .3999
.005    .639    .5729   .5625   .5128   .4911   .4312   .4262

LOWER TAIL

α/γ1    D.C.    0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900    .030    .0301   .0304   .0384   .0489   .0429   .0531
.800    .063    .0649   .0664   .0820   .0762   .0848   .0979
.700    .098    .0932   .0999   .1166   .1157   .1237   .1421
.600    .134    .1149   .1364   .1672   .1615   .1696   .2022
.500    .173    .1461   .1703   .2064   .2056   .2147   .2503
.400    .216    .1831   .2124   .2520   .2488   .2687   .3021
.300    .265    .2307   .2662   .3031   .3093   .3333   .3672
.200    .325    .2939   .3251   .3687   .3721   .3947   .429?
.100    .409    .3625   .4069   .4349   .4577   .4910   .5333
.050    .477    .4463   .4073   .5091   .5113   .5635   .6081
.020    .551    .5322   .5709   .5786   .6211   .6361   .6765
.010    .597    .5740   .6072   .6099   .6770   .6800   .7071
.005    .639    .5951   .6521   .6374   .6882   .7080   .7297


TABLE  VII 


BETA DISTRIBUTION
N = 10

PR(r21 > R) = α

UPPER TAIL

α/γ1   D.C.   0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900   .130   .1226   .1066   .1048   .0960   .0746   .0676
.800   .189   .1804   .1653   .1515   .1297   .1079   .1062
.700   .240   .2343   .2080   .1937   .1683   .1367   .1348
.600   .286   .2746   .2481   .2283   .2055   .1636   .1613
.500   .329   .3156   .2839   .2651   .2431   .1910   .1909
.400   .374   .3555   .3186   .3049   .2746   .2156   .2212
.300   .420   .4058   .3603   .3468   .3179   .2532   .2529
.200   .474   .4608   .4158   .3938   .3556   .2986   .2962
.100   .551   .5420   .4953   .4719   .4269   .3809   .3605
.050   .612   .5987   .5534   .5296   .4894   .4334   .4193
.020   .678   .6454   .6264   .6133   .5712   .5145   .5279
.010   .726   .7241   .6717   .6282   .6346   .5731   .5455
.005   .760   .7353   .7402   .7318   .6637   .6049   .5853

LOWER TAIL

α/γ1   D.C.   0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900   .130   .1243   .1340   .1545   .1540   .1651   .2016
.800   .189   .1776   .1946   .2194   .2270   .2502   .2846
.700   .240   .2225   .2447   .2700   .2947   .3111   .3384
.600   .286   .2643   .2940   .3213   .3425   .3595   .3917
.500   .329   .3065   .3373   .3706   .3847   .4036   .4381
.400   .374   .3479   .3795   .4169   .4349   .4613   ,41393
.300   .420   .3904   .4314   .4669   .4840   .5114   .5375
.200   .474   .4458   .4879   .5157   .5403   .5761   .5955
.100   .551   .5281   .5766   .5969   .6104   .6559   .6747
.050   .612   .5742   .6287   .6523   .6644   .7005   .7221
.020   .678   .6381   .6877   .7063   .7140   .7462   .7662
.010   .726   .6722   .7427   .7641   .7532   .7709   .8032
.005   .760   .7129   .7652   .7784   .8059   .8101   .8526



TABLE  VIII 


BETA DISTRIBUTION
N = 10

PR(r22 > R) = α

UPPER TAIL

α/γ1   D.C.   0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900   .150   .1492   .1327   .1320   .1206   .0979   .0905
.800   .231   .2192   .1983   .1929   .1648   .1364   .1379
.700   .285   .2718   .2540   .2325   .2058   .1706   .1768
.600   .335   .3194   .2960   .2783   .2545   .2096   .2090
.500   .384   .3690   .3398   .3158   .2935   .2432   .2470
.400   .433   .4201   .3822   .3660   .3364   .2790   .2804
.300   .483   .4613   .4251   .4130   .3814   .3239   .3215
.200   .543   .5202   .4925   .4752   .4291   .3857   .3765
.100   .620   .6175   .5767   .5632   .5146   .4560   .4499
.050   .682   .6593   .6439   .6220   .5922   .5046   .5078
.020   .749   .7297   .7070   .6896   .6801   .5705   .6303
.010   .791   .7710   .7555   .7193   .7235   .6604   .6660
.005   .826   .8133   .7619   .7586   .7506   .6813   .6853

LOWER TAIL

α/γ1   D.C.   0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900   .150   .1438   .1538   .1871   .1802   .1876   .2247
.800   .231   .2047   .2257   .2570   .2623   .2789   .3154
.700   .285   .2575   .2845   .3086   .3224   .3378   .3776
.600   .335   .3135   .3330   .3599   .3809   .3967   .4246
.500   .384   .3553   .3817   .4191   .4311   .4448   .4786
.400   .433   .40H5   .4332   .4671   .4797   .4941   .5315
.300   .483   .4613   .4960   .5289   .5312   .5533   .5834
.200   .543   .5220   .5603   .5766   .5944   .6246   .6352
.100   .620   .6032   .6407   .6538   .8724   .6995   .7106
.050   .682   .6596   .6920   .7115   .7212   .7431   .7680
.020   .749   .7206   .7636   .7780   .7803   .7955   .8182
.010   .791   .7325   .7892   .8072   .8196   .8321   .8520
.005   .826   .7565   .8023   .3214   .0609   .8578   .8757

TABLE  IX 


BETA DISTRIBUTION
N = 15

PR(r10 > R) = α

UPPER TAIL

α/γ1   D.C.   0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900   .019   .0172   .0167   .0106   .0125   .0111   .0069
.800   .040   .0365   .0353   .0202   .0269   .0206   .0136
.700   .062   .0565   .0496   .0364   .0379   .0281   .0204
.600   .085   .0772   .0642   .0525   .0514   .0394   .0282
.500   .111   .1002   .0817   .0694   .0660   .0495   .0410
.400   .141   .1268   .1059   .0891   .0836   .0601   .0543
.300   .175   .1557   .1326   .1142   .1017   .0719   .0717
.200   .220   .1983   .1749   .1451   .1301   .0923   .0909
.100   .285   .2527   .2277   .1912   .1636   .1267   .1160
.050   .338   .2976   .2652   .2371   .2018   .1599   .1442
.020   .399   .3428   .3140   .2830   .2497   .2103   .1790
.010   .438   .3702   .3319   .3160   .2789   .2605   .1981
.005   .475   .4068   .3591   .3398   .2919   .2819   .2040

LOWER TAIL

α/γ1   D.C.   0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900   .019   .0164   .0211   .0266   .0230   .0318   .0290
.800   .040   .0366   .0418   .0461   .0503   .0619   .0676
.700   .062   .0554   .0620   .0738   .0302   .0870   .1036
.600   .085   .0763   .0840   .0971   .1071   .1264   .1511
.500   .111   .0960   .1123   .1255   .1432   .1744   .2011
.400   .141   .1266   .1412   .1603   .1765   .2212   .2343
.300   .175   .1565   .1823   .2012   .2216   .2607   .2362
.200   .220   .2062   .2265   .2579   .2669   .3230   .3476
.100   .285   .2610   .3071   .3353   .3459   .3891   .4248
.050   .338   .3155   .3570   .3746   .3984   .4651   .4919
.020   .399   .3735   .4239   .4221   .4625   .5313   .5524
.010   .438   •43     .4390   .4543   .5034   .5651   .6045
.005   .475   .4060   .4800   .5262   .5150   .6813   .6121


TABLE X


BETA DISTRIBUTION
N = 15

PR(r11 > R) = α

UPPER TAIL

α/γ1   D.C.   0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900   .023   .0205   .0194   .0127   .0149   .0144   .0083
.800   .047   .0422   .0404   .0264   .0313   .0256   .0164
.700   .072   .0646   .0593   .0433   .0465   .0364   .0275
.600   .099   .0870   .0769   .0640   .0630   .0495   .0375
.500   .129   .1168   .0986   .0616   .0800   .0611   .0530
.400   .164   .1439   .1267   .1043   .1006   .0745   .0696
.300   .203   .1773   .1583   .1359   .1225   .0960   .0866
.200   .253   .2278   .1995   .1741   .1535   .1162   .1097
.100   .323   .2809   .2606   .2258   .2025   .1603   .1472
.050   .381   .3313   .3062   .2687   .2566   .1938   .1397
.020   .445   .4104   .3579   .3252   .2753   .2595   .2179
.010   .486   .4276   .3792   .3595   .3147   .3204   .2709
.005   .522   .4608   .3844   .3921   .3303   .3688   .2810

LOWER TAIL

α/γ1   D.C.   0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900   .023   .0188   .0231   .0291   .0314   .0329   .0326
.800   .047   .0408   .0453   .0525   .0560   .0665   .0764
.700   .072   .0627   .0704   .0821   .0363   .0944   .1163
.600   .099   .0883   .0931   .1075   .1170   .1384   .1609
.500   .129   .1132   .1259   .1408   .1569   .1867   .2126
.400   .164   .1405   .1622   .1771   .1940   .2314   .2476
.300   .203   .1773   .2090   .2237   .2340   .2797   .3048
.200   .253   .2368   .2565   .2845   .23o5   .3439   .3696
.100   .323   .2933   .3301   .3a27   .3707   .4201   .4453
.050   .381   .3451   .3668   .4039   .4316   .4803   .516-
.020   .445   .4053   .4518   .4624   .4390   .56o4   .5991
.010   .486   .4398   .4840   .4951   .5226   .5277   .6197
.005   .522   .4650   .5251   .5339   .5471   .6842   .6389

TABLE  XI 


BETA DISTRIBUTION
N = 15

PR(r21 > R) = α

UPPER TAIL

α/γ1   D.C.   0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900   .094   .0850   .0788   .0600   .0675   .0460   .0398
.800   .138   .1317   .1139   .0922   .0946   .0653   .0623
.700   .175   .1679   .1421   .1189   .1171   .0848   .0802
.600   .208   .1976   .1742   .1449   .1390   .1073   .1006
.500   .245   .2274   .2011   .1694   .1603   .1278   .1168
.400   .280   .2637   .2255   .2052   .1848   .1455   .1328
.300   .319   .2909   .2621   .2322   .2168   .1716   .1511
.200   .366   .3326   .2988   .2673   .2534   .2018   .1779
.100   .431   .4038   .3604   .3324   .3004   .2467   .2263
.050   .483   .4481   .4069   .3666   .3391   .2911   .2722
.020   .537   .4925   .4773   .4163   .3869   .3444   .3145
.010   .574   .5187   .5111   .4462   .4036   .3908   .3367
.005   .607   .5741   .5415   .4856   .4320   .4633   .3957

LOWER TAIL

α/γ1   D.C.   0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900   .094   .0B61   .0867   .1046   .1161   .1230   .1270
.800   .138   .1336   .1455   .1453   .1635   .1782   .2043
.700   .175   .1649   .1848   .1874   .2050   .2328   .2500
.600   .208   .1940   .2202   .2281   .2444   .2830   .3080
.500   .245   .2232   .2545   .2717   .2758   .3304   .3563
.400   .280   .2553   .2661   .3090   .3154   .3731   .4031
.300   .319   .2974   .3353   .3576   .3593   .4255   .4435
.200   .366   .3417   .3926   .4084   .4171   .4739   .5041
.100   .431   .4063   .4504   .4715   .4780   .5378   .5812
.050   .483   .4523   .4822   .5188   .5337   .5637   .6307
.020   .537   .4950   .5491   .5768   .5846   .6532   .6842
.010   .574   .5309   .5883   .6181   .6237   .7050   .7200
.005   .607   .5556   .6138   .6430   .6332   .7393   .7437



TABLE  XII 


BETA DISTRIBUTION
N = 15

PR(r22 > R) = α

UPPER TAIL

α/γ1   D.C.   0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900   .109   .0935   .0967   .0746   .0764   .0525   .0460
.800   .156   .1440   .1322   .1073   .1068   .0730   .0734
.700   .196   .1903   .1623   .1356   .1373   .1007   .1001
.600   .234   .2172   .1968   .1660   .1617   .1298   .1194
.500   .273   .2522   .2291   .2004   .1820   .1504   .1371
.400   .312   .2864   .2633   .2319   .2113   .1744   .1536
.300   .353   .3259   .2916   .2565   .2484   .2012   .1301
.200   .402   .3661   .3305   .3084   .2888   .2325   .2120
.100   .472   .4455   .3913   .3600   .3457   .2859   .2669
.050   .525   .4975   .4657   .4032   .3860   .3402   .3047
.020   .579   .5352   .4990   .4545   .4375   .3985   .3508
.010   .616   .5765   .5373   .4935   .4584   .4508   .3346
.005   .647   .6157   .5717   .5311   .4852   .5004   .4271

LOWER TAIL

α/γ1   D.C.   0.00    -0.24   -0.42   -0.64   -0.96   -1.14
.900   .109   .0955   .0984   .1144   .1285   .1307   .1365
.800   .156   .1480   .1584   .1627   .1793   .1915   .2163
.700   .196   .1863   .2039   .2032   .2208   .2448   .2600
.600   .234   .2192   .2455   .2512   .26^4   .303^   .3192
.500   .273   .2472   .2773   .2965   .2965   .3497   .3757
.400   .312   .2874   .3153   .3343   .3435   .3968   .4224
.300   .353   .3243   .3577   .3773   .3864   .4484   .4671
.200   .402   .3755   .4292   .4353   .4437   .4944   .5239
.100   .472   .4^12   .4833   .5058   .5056   .5599   .5964
.050   .525   .4944   .5248   .5442   .5552   .6056   .6496
.020   .579   .5470   .5657   .6177   .6204   .6856   .7056
.010   .616   .5795   .6172   .6424   .6446   .7139   .7390
.005   .647   .5923   .6477   .6462   .6607   .7508   .7640



TABLE XIII


GAMMA DISTRIBUTION
N = 6

PR(r10 > R) = α

UPPER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .033   .0462   .0535   .0483   .0499   .0717   .0710
.800   .079   .1023   .1137   .1025   .1047   .1339   .1455
.700   .121   .1556   .1735   .1493   .1539   .1926   .2214
.600   .164   .2107   .2343   .1990   .2099   .2538   .2928
.500   .210   .2632   .2922   .2576   .2709   .3179   .3687
.400   .261   .3262   .3437   .3185   .3360   .3915   .4339
.300   .318   .4074   .4275   .3871   .4221   .4645   .5303
.200   .386   .4701   .4977   .4930   .4994   .5414   .6132
.100   .482   .5695   .5965   .5875   .5885   .6302   .7130
.050   .560   .6702   .6773   .6616   .6507   .7092   .8088
.020   .644   .7417   .7380   .7333   .7249   .7895   .8497
.010   .693   .7618   .7761   .8036   .7929   .6144   .8756
.005   .740   .7854   .3056   .8274   .8434   .8408   .9024

LOWER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .038   .0232   .0255   .0305   .0237   .0172   .0117
.800   .079   .0508   .0529   .0503   .0507   .0354   .0222
.700   .121   .0732   .0773   .0792   .0714   .0546   .0352
.600   .164   .1096   .1160   .1037   .0994   .0323   .0518
.500   .210   .1452   .1523   .1367   .1297   .1043   .0646
.400   .261   .1830   .1900   .1699   .1679   .1310   .0834
.300   .318   .2255   .2347   .2119   .2044   .1603   .1262
.200   .386   .2874   .3101   .2313   .2514   .2117   .1826
.100   .482   .3639   .4029   .3748   .3469   .2842   .2574
.050   .560   .4517   .4853   .4640   .4037   .3444   .3270
.020   .644   .5573   .5556   .5464   .4859   .4333   .4-99
.010   .693   .6253   .5813   .5946   .5366   .4762   .5322
.005   .740   .6444   .6919   .6367   .5352   .5325   .5328

TABLE  XIV 


GAMMA DISTRIBUTION
N = 6

PR(r11 > R) = α

UPPER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .056   .0616   .0681   .0662   .0604   .0361   .0841
.800   .113   .1286   .1515   .1309   .1230   .1632   .1746
.700   .169   .1972   .2296   .1868   .1836   .2294   .2624
.600   .227   .2647   .3034   .2552   .2636   .3019   .3369
.500   .288   .3319   .3685   .3162   .3240   .3663   .4211
.400   .350   .4058   .4345   .3839   .4211   .4559   .5027
.300   .420   .4766   .5090   .4853   .5003   .5254   .5790
.200   .502   .5607   .5941   .5706   .5787   .6051   .6580
.100   .609   .6829   .7026   .6788   .6534   .7193   .7634
.050   .689   .7579   .7837   .7589   .7375   .7961   .8425
.020   .763   .7993   .8418   .8325   .8170   .8323   .8971
.010   .805   .8220   .8768   .8870   .8793   .8723   .9283
.005   .839   .8513   .9231   .8995   .9071   .8982   .9429

LOWER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .056   .0333   .0448   .0452   .0383   .0304   .0219
.800   .113   .0747   .0857   .0758   .072ft  .0615   .0426
.700   .169   .1159   .1276   .1136   .1162   .0959   .0661
.600   .227   .1665   .1775   .1584   .1552   .1260   .0375
.500   .288   .2147   .2187   .2069   .1947   .1646   .1288
.400   .350   .2773   .2885   .2608   .2510   .2135   .1726
.300   .420   .3281   .3717   .3164   .3064   .2661   .2259
.200   .502   .3995   .4374   .4100   .3609   .3244   .2894
.100   .609   .4970   .5627   .5121   .4744   .4176   .4128
.050   .689   .5893   .6559   .6126   .5625   .5347   .5008
.020   .763   .6355   .7403   .7372   .6683   .6351   .5640
.010   .805   .7770   .8347   .7731   .7477   .7134   .7671
.005   .839   .8150   .8578   .8318   .7666   .7767   .7840



TABLE  XV 


GAMMA DISTRIBUTION
N = 6

PR(r21 > R) = α

UPPER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .268   .3243   .3152   .3060   .3070   .3420   .3612
.800   .364   .4033   .4553   .4153   .3940   .4599   .5033
.700   .439   .4917   .5220   .4885   .4939   .5460   .5929
.600   .504   .5552   .5791   .5604   .5788   .6066   .6375
.500   .563   .5211   .6367   .6314   .6475   .6671   .7118
.400   .621   .6830   .6983   .6838   .6949   .7103   .7665
.300   .680   .7453   .7500   .7321   .7506   .7505   .8136
.200   .745   .8105   .8009   .7991   .8102   .8164   .8676
.100   .821   .8616   .6658   .8671   .8739   .8852   .9252
.050   .872   .8928   .8960   .9181   .9170   .9262   .9512
.020   .924   .9231   .9338   .9587   .9559   .9543   .9731
.010   .951   .9462   .9664   .9662   .9645   .9671   .9761
.005   .970   .9662   .9736   .9742   .9762   .9749   .9792

LOWER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .263   .2056   .2025   .1843   .1913   .1522   .1044
.800   .364   .2805   .2902   .2700   .2697   .2271   .1846
.700   .439   .3545   .3601   .3220   .3419   .2882   .2436
.600   .504   .4161   .4369   .3836   .4062   .3366   .2900
.500   .563   .4761   .4915   .4450   .4568   .3928   .3477
.400   .621   .5448   .5526   .5163   .5193   .4652   .4156
.300   .680   .5998   .6130   .5772   .5865   .5454   .5002
.200   .745   .6813   .6346   .6611   .6599   .6267   .5766
.100   .821   .7765   .7736   .7809   .7838   .73S6   .6922
.050   .872   .8552   .8418   .8332   .8403   .3275   .7644
.020   .924   .9049   .8903   .8916   .9030   .8942   .8429
.010   .951   .9487   .9164   .9160   .9385   .9153   .8824
.005   .970   .9745   .9434   .9340   .9568   .9253   .9108

TABLE  XVI 


GAMMA DISTRIBUTION
N = 6

PR(r22 > R) = α

UPPER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .410   .4385   .4711   .4499   .4411   .4684   .4646
.800   .540   .5787   .5963   .5518   .5719   .6240   .6420
.700   .640   .6702   .6703   .6579   .6811   .6939   .7326
.600   .720   .7426   .7385   .7329   .7628   .7594   .7916
.500   .790   .7944   .7927   .7941   .8161   .8061   .8432
.400   .830   .8390   .8461   .0441   .8592   .8495   .6825
.300   .880   .8864   .8829   .3774   .9023   .8867   .9180
.200   .930   .9310   .9132   .9192   .9381   .9297   .94o8
.100   .965   .9679   .9534   .9610   .9676   .9614   .9758
.050   .983   .9853   .9818   .9319   .9866   .9763   .9904
.020   .992   .9933   .9941   .9933   .9955   .9884   .9970
.010   .995   .9950   .9974   .9965   .9976   .9925   .99e9
.005   .998   .9980   .9986   .9982   .9984   .9963   .9993

LOWER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .410   .3562   .3423   .3097   .3433   .2939   .2369
.800   .540   .4840   .4730   .4327   .4763   .4011   .3656
.700   .640   .5595   .5607   .5443   .5657   .4918   .4612
.600   .720   .6335   .6352   .6148   .6340   .5638   .5419
.500   .790   .6933   .7068   .6868   .7082   .6477   .6278
.400   .830   .7679   .7657   .7516   .7758   .7198   .7067
.300   .880   .8276   .3247   .8182   .8382   .7764   .7703
.200   .930   .86c0   .8779   .6576   .8949   .8458   .8490
.100   .965   .9439   .9293   .9371   .9495   .91BP   .9189
.050   .983   .9769   .9693   .9619   .9776   .9516   .9628
.020   .992   .9698   .9902   .9847   .9899   .9780   .9909
.010   .995   .9941   .9944   .9921   .9947   .9902   .9962
.005   .998   .9978   .9979   .9990   .9966   .9922   .9980



TABLE  XVII 


GAMMA DISTRIBUTION
N = 10

PR(r10 > R) = α

UPPER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .025   .0400   .0447   .0382   .0359   .0400   .0552
.800   .051   .0739   .0817   .0747   .0821   .0e42   .1190
.700   .080   .1069   .1221   .1210   .1292   .1303   .1723
.600   .110   .1504   .1631   .1635   .1807   .1B27   .2259
.500   .142   .1924   .2021   .2054   .2273   .2383   .3019
.400   .178   .2377   .2461   .2672   .2770   .3036   .3607
.300   .219   .2901   .3009   .3226   .3393   .3612   .4280
.200   .273   .3608   .3814   .4021   .4184   .4436   .4977
.100   .349   .4319   .4715   .5030   .5276   .5359   .6162
.050   .412   .5027   .5266   .5511   .5898   .6073   .6979
.020   .483   .5787   .6053   .6274   .6585   .6652   .7*58
.010   .527   .6351   .6357   .6589   .6819   .7121   .7671
.005   .568   .6445   .6862   .6882   .7243   .7281   .B013

LOWER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .025   .0151   .0179   .0163   .0150   .0096   .0033
.800   .051   .0362   .0347   .0299   .0289   .0204   .0077
.700   .080   .0532   .0517   .0447   .0393   .0303   .0130
.600   .110   .0741   .0726   .0621   .0560   .0426   .0202
.500   .142   .0953   .0918   .0840   .0721   .0550   .0277
.400   .178   .1229   .1168   .1027   .0932   .0678   .0376
.300   .219   .1449   .1423   .1291   .1177   .08u■>  .0*91
.200   .273   .1852   .1719   .1658   .1522   .1147   .0643
.100   .349   .2420   .2291   .2301   .2036   .1619   .0922
.050   .412   .3073   .2860   .2743   .2*41   .2025   .1270
.020   .483   .3354   .3631   .3371   .2802   .2671   .1575
.010   .527   .3639   .3993   .3774   .3394   .3120   .1735
.005   .568   .3690   .4483   .3974   .3512   .3445   .1835



TABLE  XVIII 

GAMMA DISTRIBUTION
N = 10

PR(r11 > R) = α

UPPER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .030   .0435   .0507   .0438   .0405   .0422   .0576
.800   .063   .0829   .0924   .0881   .0911   .0914   .1265
.700   .098   .1220   .1334   .1376   .1478   .1503   .1502
.600   .134   .1673   .1839   .1886   .2006   .2009   .2346
.500   .173   .2220   .2335   .2315   .2521   .2624   .305y
.400   .216   .2675   .2791   .2944   .3105   .3234   .5745
.300   .265   .3275   .3404   .3551   .3797   .3870   .4441
.200   .325   .3927   .4221   .4365   .4508   .4742   .5172
.100   .409   .4991   .5175   .5454   .5653   .5767   .6305
.050   .477   .5494   .5794   .6118   .6340   .6404   .7276
.020   .551   .6456   .6443   .6619   .6952   .7094   .7714
.010   .597   .6959   .6893   .7169   .7348   .7278   .7P99
.005   .639   .7200   .7412   .7473   .7725   .7815   .8133

LOWER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .030   .0211   .0236   .0244   .0219   .0138   .0052
.800   .063   .0423   .0473   .0418   .0405   .0303   .0130
.700   .098   .0721   .0709   .0618   .0579   .0449   .0203
.600   .134   .0949   .0961   .0852   .0795   .0634   .0293
.500   .173   .1277   .1230   .1134   .1019   .0774   .0406
.400   .216   .1537   .1563   .1395   .1314   .0974   .0585
.300   .265   .1923   .1344   .1702   .1619   .1231   .0760
.200   .325   .2469   .2304   .2108   .2081   .1624   .0970
.100   .409   .3130   .3013   .2985   .2598   .218?   .1^21
.050   .477   .3803   .3769   .3358   .3127   .2619   .1358
.020   .551   .4277   .4634   .4714   .3633   .3816   .2323
.010   .597   .4716   .5138   .5006   .3853   .4028   .2845
.005   .639   .4994   .5336   .5449   .4244   .4391   .3072



TABLE  XIX 


GAMMA DISTRIBUTION
N = 10

PR(r21 > R) = α

UPPER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .130   .1646   .1656   .1894   .1319   .2151   .2265
.800   .189   .2349   .2521   .2599   .2771   .3004   .3260
.700   .240   .2934   .3166   .3206   .3431   .3582   .3963
.600   .286   .3456   .3716   .3699   .3992   .4153   .4735
.500   .329   .3993   .4207   .4209   .4507   .4629   .5256
.400   .374   .4614   .4674   .4669   .5023   .5220   .5831
.300   .420   .5116   .5135   .5255   .5575   .5766   .6334
.200   .474   .5619   .5728   .5955   .6197   .6296   .6943
.100   .551   .6423   .6456   .6606   .6983   .6931   .7629
.050   .612   .7103   .6878   .7209   .7514   .7591   .8166
.020   .678   .7755   .7550   .7660   .7975   .8004   .3510
.010   .726   .7890   .7686   .8088   .3196   .8281   .3627
.005   .760   .8346   .7959   .8174   .8414   .8501   .8926

LOWER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .130   .0923   .0986   .0933   .0066   .0763   .0351
.800   .189   .1290   .1306   .1301   .1222   .1025   .0567
.700   .240   .1743   .1726   .1617   .1580   .1286   .0749
.600   .286   .2048   .2055   .1931   .1818   .1484   .0935
.500   .329   .2401   .2408   .2230   .2165   .1749   .1104
.400   .374   .2934   .2698   .2586   .2512   .2067   .1335
.300   .420   .3387   .3071   .2960   .2847   .2384   .1632
.200   .474   .3894   .3656   .3385   .3293   .2813   .2006
.100   .551   .4510   .4429   .4116   .3976   .3538   .2569
.050   .612   .5099   .5048   .4791   .4468   .4283   .3124
.020   .678   .6164   .5883   .5640   .5054   .5024   .3930
.010   .726   .6751   .6391   .6322   .5609   .5521   .4209
.005   .760   .7087   .6638   .6746   .6026   .5785   .4629

TABLE XX

GAMMA DISTRIBUTION
N = 10

PR(r22 > R) = α

UPPER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .150   .1946   .1882   .2198   .2127   .2404   .2424
.800   .231   .2716   .2816   .2954   .3122   .3273   .3561
.700   .285   .3300   .3548   .3544   .3766   .3911   .4327
.600   .335   .3056   .4133   .4109   .4367   .4511   .5006
.500   .384   .44eO   .4672   .4622   .4943   .5039   .5520
.400   .433   .5000   .5147   .5179   .5503   .5590   .6084
.300   .483   .5610   .5648   .5705   .6078   .6183   .6648
.200   .543   .6192   .6214   .6285   .6685   .6670   .7226
.100   .620   .6831   .6370   .7081   .7353   .7464   .8029
.050   .682   .7500   .7348   .7662   .7907   .7917   .8502
.020   .749   .8078   .8062   .3169   .8329   .6400   .8788
.010   .791   .S349   .8319   .8398   .8579   .8691   .8999
.005   .826   .8810   .8597   .8527   .8708   .6870   .9082

LOWER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .150   .1140   .1253   .1193   .1159   .1050   .0523
.800   .231   .1697   .1653   .1598   .1573   .1365   .0807
.700   .285   .2164   .2136   .2050   .2017   .1700   .1043
.600   .335   .2551   .2530   .2413   .2381   .1974   .1717
.500   .384   .3082   .2940   .2795   .2698   .2327   .1576
.400   .433   .3510   .3303   .3239   .3092   .2646   .1P96
.300   .483   .4116   .3752   .3645   .3592   .3066   .2214
.200   .543   .4624   .4388   .4222   .4082   .3638   .2674
.100   .620   .5397   .5175   .4863   .4368   .4433   .3542
.050   .682   .5994   .6095   .5649   .5460   .5229   .4250
.020   .749   .6966   .6629   .6499   .613o   .5971   .5172
.010   .791   .7281   .7042   .7076   .6519   .6555   .5842
.005   .826   .8212   .7063   .7444   .6746   .6745   .6113



TABLE  XXI 


GAMMA DISTRIBUTION
N = 15

PR(r10 > R) = α

UPPER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .019   .0340   .0263   .0339   .0330   .0351   .0431
.800   .040   .0653   .0563   .0645   .0691   .0741   .0990
.700   .062   .0981   .0865   .0962   .1056   .1184   .1401
.600   .085   .1365   .1201   .1361   .1418   .1600   .1935
.500   .111   .1749   .1557   .1760   .1857   .2033   .2409
.400   .141   .2139   .1901   .2211   .2293   .2570   .2970
.300   .175   .2586   .2425   .2756   .2863   .3162   .3707
.200   .220   .3140   .3159   .3358   .3611   .3861   .4452
.100   .285   .3890   .3947   .4249   .4423   .4764   .5327
.050   .338   .4499   .4646   .4354   .5055   .5561   .5910
.020   .399   .5063   .5372   .5093   .5535   .6232   .6915
.010   .438   .5414   .5649   .5781   .5941   .6678   .7512
.005   .475   .5769   .5859   .6267   .6126   .6997   .7628

LOWER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .019   .0123   .0103   .0116   .0090   .0069   .0027
.800   .040   .0269   .0202   .0237   .0190   .0145   .0046
.700   .062   .0416   .0315   .0370   .0284   .0214   .0076
.600   .085   .0578   .0433   .0511   .0394   .0287   .0117
.500   .111   .0705   .0572   .0639   .0513   .0366   .0161
.400   .141   .0893   .0751   .0796   .0638   .0464   .0210
.300   .175   .1099   .0915   .0982   .0798   .0623   .0276
.200   .220   .1352   .1205   .1226   .1017   .0834   .0350
.100   .285   .1819   .1564   .1593   .1349   .1108   .0555
.050   .338   .2208   .1964   .1900   .1681   .1376   .0698
.020   .399   .2813   .2480   .2407   .1930   .1723   .1007
.010   .438   .3124   .2704   .2757   .2272   .1985   .1307
.005   .475   .3543   .2839   .2854   .2377   .2332   .1463


TABLE XXII
GAMMA DISTRIBUTION
N = 15

PR(r11 > R) = α

UPPER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .023   .0352   .0297   .0370   .0364   .0372   .0449
.800   .047   .0731   .0603   .0693   .0759   .0795   .1003
.700   .072   .1114   .0964   .1011   .1129   .1243   .1471
.600   .099   .1528   .1313   .1494   .1552   .16ft4  .1982
.500   .129   .1920   .1674   .1907   .1984   .2135   .24S0
.400   .164   .2320   .2079   .2386   .2413   .i.666  .3031
.300   .203   .2322   .2582   .2923   .3066   .3254   .3801
.200   .253   .3454   .3356   .3616   .3811   .4030   .4513
.100   .323   .4178   .4242   .4483   .4593   .4954   .5432
.050   .381   .4776   .4841   .5126   .5383   .5820   .5945
.020   .445   .5309   .5595   .5587   .5778   .6515   .7012
.010   .486   .5796   .6098   .5988   .6237   .6976   .7572
.005   .522   .6138   .6292   .6508   .6413   .7217   .7703

LOWER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .023   .0171   .0123   .0170   .0127   .0100   .0040
.800   .047   .0332   .0263   .0317   .0244   .0201   .0072
.700   .072   .0525   .0406   .0465   .0365   .0295   .0117
.600   .099   .0742   .0536   .0632   .0520   .0373   .0166
.500   .129   .0923   .0715   .0788   .0685   .050S   .0227
.400   .164   .1113   .0944   .1032   .0814   .0653   .0300
.300   .203   .1360   .1156   .1261   .1019   .0801   .0376
.200   .253   .16o5   .1404   .1590   .1335   .1022   .0493
.100   .323   .2210   .1935   .2005   .1745   .1436   .0719
.050   .381   .2776   .2433   .2453   .2067   .1805   .1024
.020   .445   .3375   .2923   .2379   .2541   .216?   .1290
.010   .486   .3841   .3295   .3081   .2365   .2411   .1564
.005   .522   .4126   .3547   .3214   .3081   .2935   .1602
TABLE XXIII


GAMMA DISTRIBUTION
N = 15

PR(r21 > R) = α

UPPER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .094   .1296   .1246   .1329   .1342   .1670   .1959
.800   .138   .1926   .1787   .1862   .1950   .2347   .2663
.700   .175   .2435   .2379   .2401   .2446   .2854   .3232
.600   .208   .2833   .2631   .2902   .2852   .3337   .3316
.500   .245   .3251   .3251   .3409   .3353   .3696   .4286
.400   .280   .3726   .3746   .3919   .3926   .4392   .4771
.300   .319   .4247   .4223   .4410   .4493   .4864   .5296
.200   .366   .4754   .4710   .4959   .5038   .5443   .5940
.100   .431   .5352   .5410   .5609   .5850   .6212   .6622
.050   .483   .5886   .5343   .6202   .6312   .6769   .7143
.020   .537   .6353   .6803   .6645   .6677   .7445   .7718
.010   .574   .6750   .7164   .6973   .7007   .7336   .7999
.005   .607   .7013   .7227   .7663   .7264   .7977   .8134

LOWER TAIL

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .094   .0658   .0593   .0555   .0521   .0432   .0172
.800   .138   .0979   .0828   .0829   .0785   .0600   .0262
.700   .175   .1293   .1077   .1058   .0967   .0770   .0368
.600   .208   .1529   .1303   .1283   .1123   .0944   .0451
.500   .245   .1712   .1494   .1516   .1339   .1109   .0544
.400   .280   .1956   .1769   .1793   .1555   .1233   .0657
.300   .319   .2232   .2051   .2080   .1793   .1547   .0807
.200   .366   .2552   .2444   .2453   .2142   .1791   .0095
.100   .431   .3203   .2918   .2936   .2512   .2271   .1360
.050   .483   .3573   .3419   .3273   .2S59   .2554   .1594
.020   .537   .4169   .3777   .3612   .3589   .2927   .2033
.010   .574   .4545   .4146   .4038   .3752   .3064   .2335
.005   .607   .5040   .4885   .4327   .4313   .3541   .2409
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TA8LE  XXIV 


GAMMA  DISTRIBUTION 
N  =  15 


PR (  r22>R)  «  a 
UPPER  TAIL 


a/y,  | 

j  •  c  • 

0.  ?2 

0.39 

1.00 

1.15 

1.41 

2.00 

.  ,  00 

.100 

.1417 

.1392 

.1421 

.1416 

.  1663 

.2030 

s 

T 

,0  00 

;  .  1 56 

.2146 

.1966 

.2001 

.2061 

.2515 

.2746 

.700 

.196 

.2607 

.2535 

.2553 

.2603 

.2994 

.3365 

f 

.600 

!  .234 

.3077 

.3045 

.3086 

.3051 

.3530 

.3902 

t 

1 

.500 

.273 

.34  39 

.3553 

.3630 

.3563 

.4116 

.43  59 

f 

.  *tOC 

.312 

.3939 

.3939 

.4093 

.4136 

.4573 

.4979 

i 

l 

f- 

.300 

.353 

.4520 

.4514 

.4647 

.4764 

.5096 

.5404 

•  200 

.402 

•  5051 

.4936 

.5249 

.5339 

.5694 

.6056 

.100 

.472 

.5717 

.5665 

.  5886 

.6150 

.  6459 

.6776 

i 

i 

.050 

.525 

.6190 

.6294 

.6390 

.6528 

.6980 

.7298 

i 

.020  j 

.579 

.6732 

.6366 

.6936 

.6967 

.7551 

.7765 

.010 

.616 

.6935 

.7294 

.7287 

.7269 

.7926 

.  8036 

f 

i 

i 

.005 

i 

.647 

.7514 

.7752  .7781  .7348 

LOWER  TAIL 

.3019 

.8178 

α/γ1   D.C.   0.82    0.89    1.00    1.15    1.41    2.00
.900   .109   .0736   .0735   .0649   .0636   .0537   .0223
.800   .156   .1179   .1022   .1018   .0919   .0755   .0256
.700   .196   .1521   .1283   .1278   .1137   .0992   .0474
.600   .234   .1765   .1524   .1541   .1357   .1179   .0589
.500   .273   .1993   .1822   .1764   .1592   .1354   .0720
.400   .312   .2261   .2160   .2087   .1927   .1616   .0346
.300   .353   .2577   .2448   .2402   .2171   .1870   .1C23
.200   .402   .3027   .2316   .2320   .2493   .2134   .1256
.100   .472   .3617   .3478   .3447   .2377   .2635   .1628
.050   .525   .42'-6  .3541   .3943   .3530   .2014   .19CS
.020   .579   .4701   .4433   .4280   .4177   .3656   .2501
.010   .616   .5227   .4600   .4676   .4403   .3955   .2333
.005   .647   .5442   .5258   .4969   .4579   .4055   .3124
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APPENDIX  B 

TABLES  LISTING  SIGNIFICANCE  LEVEL  AT  WHICH 
BETA  AND  GAMMA  STATISTICS  DIFFER  FROM 
NORMAL  STATISTICS 


TABLE XXV


KOLMOGOROV-SMIRNOV GOODNESS-OF-FIT TEST RESULTS


Y1 

c 

-.24  - 

,L  2  - 

■  S  4  -  .  -it 

“  1  •  1 

MA 

u  •  s . 

.01 

S  |  _ 

•  •  “ 

UPPER 

.01 

c 

TAIL 

.01  .01 

.01 

r,*9 

•..  s. 

.01 

.0! 

.01 

.Cl 

.01 

v*  ‘  1 

r21 

‘22 

N ,  S . 

.01 

.01 

.01  .01 

.01 

.05 

.01 

.01 

.01 

.01 

.01 

Ti  A 

V ,  s . 

N.S. 

LOVER 

,10 

TAIL 
*  0 ±  #  0 . 

.01 

;j  0 

N.S. 

N.S. 

N.S. 

.01 

.01 

.01 

roi 

>!  •  S . 

N.S. 

.05 

.01 

.01 

.01 

.21 

‘22 

.10 

N.S. 

N.S. 

N.S.  .01 

N.S 

*10 

N.S. 

.01 

N  * 

UPPER 

.01 

10 

TAIL 
.01  .01 

.01 

r‘  1 
r21 

N  i  S • 

.01 

.01 

.01 

.01 

.01 

N  .  S . 

.01 

.or  - 

.01 

.01 

.01 

r22 

■N.S. 

.01 

.01 

.01 

.Cl 

.01 

r10 

.Cl 

N.S. 

LOWER 

.01 

TAIL 

.01  .01 

.01 

*  A  A 

'21  ■ 

.01 

N.S. 

.01 

.01 

.01 

.01 

.05' 

N.S. 

.01 

.01 

.01 

•  01 

r22 

.10 

N.S. 

.01 

.01 

.01 

.Cl 

'  N  =  13 


r’0 

.05 

.01 

UPPER 

-.01 

TAIL 

,01  ,01 

.01 

rM 

r21 

.05 

.01 

.01 

.01 

.Cl 

.01 

•  V#1  — 

.01 

.01 

.01 

.01 

.01 

r22 

.01 

.01 

.01 

.01  .01 

.01 

r10 

.05 

N.S. 

LOWER 
,0  5 

TAIL 

•  r  • 

•  •  aw* 

r\  * 

0  W  4. 

r-, - 

.05 

•N.S. 

N.S. 

.01 

.01 

.01 

r21 

r22 

.01 

N.S. 

n  * 

•  w  . 

.01 

.01 

.01 

.01 

N.S. 

.05 

.01  .01 

.01 

* N.S. MEANS NOT SIGNIFICANTLY DIFFERENT
FROM DIXON'S VALUES AT .10 RISK LEVEL OR LOWER


TABLE XXVI

KOLMOGOROV-SMIRNOV GOODNESS-OF-FIT TEST RESULTS


0  *  c  2 

C.PO 

"  C\  "i 

±  •  \J  \J  4, 

.15  1 

.41 

2.00 

N  * 

6 

UPPER 

TAIL 

rJ0 

r*1 

r22 

.Cl 

.01 

.01 

.01 

.01 

.01 

.01 

♦  01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

■N  •  S  . 

.05 

N.S 

LOWER 

TA  IL 

.01 

.01 

.01 

.01 

.01 

.01 

r-- 

.01 

.01 

.01 

.01 

.01 

.01 

r21 

.Cl 

.01 

.01 

.01 

.01 

.01 

r22 

.01 

.01 

.01 

.01 

.01 

.01 

N  ■ 

10 

UPPER 

TAIL 

fl° 

r22 

.Cl 

.01 

.01 

.01 

.Cl 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

LOWER 

TAIL 

po 

r21 

r22 

.01 

•or 

.01 

.01 

•  01 

.01 

.01 

.01 

.01 

.01 

.01 

•  01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

N  ■ 

15 

UPPER 

TA  IL 

•A 

H? 

.01 

.01 

.01 

.01 

•  01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

A  '  ^ 

r2“ 

r22 

.01 

.01 

,01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

LOWER 

TAIL 

Ti  /-i 

.  c « 

,01 

.01 

.01 

•  Cl 

.01 

r-!? 

*\  1 

.  w  . 

.01 

.01 

.01 

.01 

.01 

->  • 

r21 

r22 

.01 

.01 

.01 

.01 

.0: 

.01 

.01 

.01 

.01 

.01 

.01 

.01 

* N.S. MEANS NOT SIGNIFICANTLY DIFFERENT
FROM DIXON'S VALUES AT .10 RISK LEVEL OR LOWER
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APPENDIX  C 

MACHINE PROGRAMMING OF DISTRIBUTIONS

JERRY THOMAS - BETA DISTRIBUTION
BLOC ( RR-RR600 ) 

BLOC!  <30-00599  JHH-HH599  )BB-B8599)4 
BLOC( CC-CC599 )00-DD599 )EE-E£599 ) 

BLOC (ZZ-ZZ 1300 )ZZZ-ZZZ 13) 

BLOC(Q-Q600) 

BLOC(Nl-N80)Yi-Y40)GGl-GG400)Al-A40)Bl-B40)4 
SYN  (Xl*N2 )GX»GG1 )GY«GG101)GZ*GG201 )GW*GG301 ) 

START  ENTER: SETOPO) 

RE  AO  (  RCC  >  $  ENTER(CVFT0I  XRCC)  <RC) 

ZZZ  =  . 90$  ZZZl*.804  ZZZ2*.704  III 3*. 604  ZZZ4-.50 
ZZZ5-.404  ZZ Z6-.304  ZZZ7-.204  ZZZ8-.10* 
ZZZ9-.054  ZZZ10-.024  ZZZ11-.014  ZZZ12-.0054 
SET(HP»1)W«15)G*0 )4 

AGAIN  READ-F0RMAT( H)-( 80 )N0S.AT{N1)4  ST0R*08 

INC ( HP*HP+1 ) $ 

NEX»094  ENTERl PRINT  B) 

MM" 04  MMM"04SET (WW«0 ) 4  ENTER(PRINT  B)S 
ENTER( ZEROCC ) 4  SET  (ALP* ALP 1 ) (BTA"BTA1)4 
ALPHA  SET ( K«0 ) L"0 ) 4  EPS*.000014 

1.0  N-N1.K4  X*X1 »K4  0*04  IF ( X<100 ) GOTO ( 1*1)4 

IF.C  N-X<2  >  GOTO:  1.1)4  Y-X/N4  G0T0U.7) 

1  A-1/:n+1)4  B*l6C*X+l$J*N-X4I*14lNT(RB*RB*RA)4 

BRLESC4B8IRB) (/OOMMM)  (RB)4 
enter: CVXT OF ) (RB) ( R) 

1.2  IF(B>J>GOTO< 1.3)4 

A»A#R/C4  B-B+1S  C-C  +  14  G0T0(1,2) 

1.3  A**A4G0T0 ( »  8TA ) 4 

1.4  YY*Y/ 1 1-Y ) 4  F" ( 1-Y  ) 4 
IF{Y<0)OR(Y>1)GOTO(2.3) 

1.41  ss«exp:x'clog:y)+j«log:f)) 

D-D+14  IF(D>50)G0T0(2.0) 

I F ( Q>47 ) GOTO ( 2.1) 

1.43  IF(Y>.95)GOrO(1.8) 

S"S0*SS#Y/(X+1)4  1*1 

1.5  if:  i-j+i) within: .ocd goto:  1.6) 

S*S*YY( J-I+1)/(X+I+1) 

SO*SO+S %  1*1  +  14  goto: 1. 5) 

1.6  GOTO ( f  ALP  )  4 

ALP3  ET*<S0-A*R)/SS4IF(D>21G0T0(1.61)S  ET-ET/2 

1.61  Y*Y-ET4IF< Y-l) WITH  INI .0001 )GOTO( 1. 7 ) 4 

I F ( Y>1 ) GOTO( 2.4 ) 


1.65 

1.7 


17.3 

7STFWD 

10.44 


10.22 


11.55 


3.11 

3.01 

3.0 

3.1 


3.22 


3.3 

3.4 

3.5 

3.6 

3.7 
3.03 


[  F-ARS  (  FT>FO<;  ir-OTO  (1  .4) 

IF( Y>1 )G0T0(2.4)  S  Yl.L-Y 
[NC(K=KK2)f.  I NC ( L “ L+l ) 

iF-iNr;L<sroK)Gorod.o)s  inciresc-RESC+i > 

SET  <  M  =  0 ) 

MM-MM  +Y1,MS  MMM-MMM+(Y1, M*Y1,M  It 
COUNT  tW)  IN ( M ) GOTO (  17.8)  S  S£T(M«0)i 
IF(Y1.M<=Y2,M) GOTO (10.22) 

Y50-Y1  »K'i>  Yl.M-Y2,Mt  Y2»M«Y50$  P-M 
IF-INT(P*0)G0T0(TSTFWD)$  INC(P-P-I) 

I F {  Y).,P<»Y2,P)GOTO(TSTFWD5  $ 

Y50-Y1.PS  Y1.P-Y2.PS  Y2.P-Y50*  GOTO(10.44)$ 

COUNT(W-l) IN(M)GOTOITSTFWD) 

0»WW=(Y,W-Y, (W-l)  )/(Y,W-V2)  i 
BP,WW=  ,'Y.W-Y,  (W-2)  )/(Y,W-Y2)i 
OOtWW* (Y,W-Y, (W-l>  )/(Y,W-Yl)$ 

HM» WW*( Y2-Y1 1 / ( Y.W-Yl )S 

E6.WW- ( Y3-Y1 )/( Y. I W-2)-Yl )  \ 

CCtWW-(Y3-Yl  )/(Y»  CW“1)“Y1)  $  | 

DO,WW«( Y,W-Y, ( W-2) )/( Y»W-Y3)$  I 

RR,WW»(Y2-Yl)/(Yt(W-l)-Yl)$  INC < WW-WW+1 ) i  1 

IF-INT(RESC<RC)GOTG»ALPHA)  I 

ENTER(CVlTOF)(W)  (FQ)$DID»MM/(FQ*600U 

SIO*  ( F0«600’fMMM-MH#MM) /  (FQ^iOO (  FQ*600-1 )  )$  ] 

PRINT<MEAN  ■  >OID<  VARIANCE  *  >SID  I 

SET { P-0)  J 

SET ( M-0 )  • 

l r ( QQ»  M<-QQ1 » M ) GOTO ( 3.22 ) 

Y50=0Q,K$  QOiM-QQlfK*  QQ1.M-Y50*  L-M  i 

IF-INT(L*0)GOTO(3.0)6  INCU-L-1)  i 

IF(QO»l<*QQl»UGOTO{3.0)S 
Y50-00,L5Q0fL*0QlfLS  QQ1.L-Y50*  G0T0(3.1)S 
COUNT! 599 ) IN (M) GOTO! 3.0) 

ZZ  »G=Q059i  ZZ1.G-QC119J  ZZ2.G-0Q177t  ZZ3.G-QQ239 
ZZ4.G-QQ299S  ZZ5» G=QQ359$  ZZ6.G-QQ419JZZ7, G-QQ479 
ZZ 8,G=OQ539S  ZZ9.G-QQ569J  ZZ10.G-QQ5S7J. 

ZZli,G»QQ593S  ZZ 12.G-QQ596*  INC(G«G+13)i 
INC  C  P-P  +  l ) *  IF-1NT(P>  7)G0T0( 18.69) $ 

I F- INT ( P* 1 ) GOTO (3.3)$  I F- INT I P»2  )GOTO( 3. 4 ) i 
l F-INT ( P* 3 ) GOTO (3.5)$  I  F- INT( P-4 ) G0T0( 3. 6 ) 

( F- 1  NT <  P=5)G0T0(3. 7) $ 

IF-INT(P-6)G0T0(3.03)5ir-INT(P-7)GOT0(3.8) S 
MOVE ( 600 ) NOS .FROM ( FH) TO ( OQ ) $  G0T0(3.Cl) 

MOV E( 600) NOS. FROM { 0  ) TO ( OQ ) $GOTO ( 3 . 01 ) 

MOVE ( 600) NOS, FROM ( PR ) TO ( QQ > *GOTO ( 3. 01 ) 

MOVE (600) NOS. FROM ( 8B)TC(QQ)*  GOTO! 3.01! 

MO V5 ( 600) NOS. FROM { CC)TO(OQ )$  GOTO( 3.01) 
M0VE(600)N0S.FR0Ml  CD)TO(QQ)*  GOTO! 3.0i> 
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f'CV  F  (  600  ) MOS  . FROM  (  EE  )  TO  (  QQ  )  $  GOTO  (3.01) 

KKIN1<  KB  > 

ENTER(SEXAPR)  (  R  F3 )  <  RB  ) 

CL E AR ( 400 ) NO  S , AT ( GG1 ) $  S£T!RESC»0)i  V=OS  VA*OS 
K2-0%  UA=0S  UR=0$  UC«0$  UX*0$  UY=0*  UZ*0$  M4*0» 

Ml =0 i  K2=CSUD*0S  UW«0*  SET(K=0)L»0$ 

I F- INI ( HP>6 ) GOTO [ 16.11 )  i  GOTO! AGAIN )$ 

.3  S-SO»SS*F/ ( J  +  l J  $  I«1 

1.9  S«S1X-I+1)/YY*< J+I+l) 

SO-SO+SS  I- I ♦IS  l  F  ( I><X+.001)G0T0(1.9) 

SO-A-SOS  GOTO! 1.6) 

2.0  PR I N7<  ERROR> 

PRXMTt  R1 YIN) X  I ET) *  GOTO(l.O) 

2.1  PRINT<R)Y)N)X)ET)D>S  G0T0(1.43) 

2.2  YSEXP(L0G!R)/!N+1 ) >$G0T0(1.7)$ 

2.3  PRINK  Y)N)  X)R)ET)  D)S  G0T0(1.41) 

2.4  SET ( PPaO  ) 

2.5  ET=ARS!ET/2)  i  Y«Y-ET4-  INC!  PP-PP+1 )  $ 

IF-INT( PP>20 )GOTO( 2.6)$  IF ( Y>I )G0T0( 2.5) $ 

I F!Y*i) WITHIN!. 0001  ) GOTO! 1.7)$  GOTO!  1.65  ) 

2.6  CLEAR!  400 )  NOS  .AT  (  GG1 )  S  SET!RESC»0)$  V«0$  VA«OS 
M1*0 $  M2=0$UD*0$  UW*0$ 

M3«0$  UAaO$  UB»0$  UC-OS  UX»0$  UY«0$  UZ»0$M4»0$ 

I F-I NT ! HP>6 ) GOTO! 16.11)$  G0T0!AGAIN)$ 

F5  FORM! 3-14) 12-4)3-2) 1-1 >12-6)3-1) 1-6) 

H  FORM! 10-10 110-10) 

RB  SEX ABRL ESC ( 00L7N J60L3K 5LS003 ) i 

RA  SEXABRLESC(OO422NK8S0KCOK425) 

V«!X-1)/N$A«A* (N  +  l  )/X$X«X-l$  GOTO! 1.4) S 
X-X+14  IF(N<X+H-.0Cl)G0T0!BT2.1)$  Y«!X+1)/N 
A-AMN+1  )/!N-X)$J«J-l  $  GOTO!  1.4)  $ 

S0»A»1  SGOTO  (  »  ALP  )  $ 

Y*X/N$  IF!  N*X  )  WITHIN!  ,.QOl  JGOTO!  2 .2  )  $  GOTO  (1.4)  5 
Zl» ! X-l )/N$Z  2* ( X+l )/N$ I F{ Z  2< 1 )GOTO( 2*1)%  Z2»l 

Y«Z1  +  (  Z2-Z1 )  !R-A1 *  L ) / ! B1  »L-A1 »  L )  $  G0T0H.7) 
A1,L*S0/ ASSET! ALP*ALP2 JGOTO l BTA2 ) S 
B1.L*S0/AS  INC <  K*'<+2 ) l L"L+1 ) $ 

SET! ALP-ALP1) (BTA*DTA1)$ 

IF-INT!L<ST0R)60T0(1.0)$ 

SET ! ALP«ALP3 ) !BTA» BTA3) GOTO! ALPHA )$ 

SET { M*0 ) K*0 ) $ 

PRINT- (F5)-!ZZZ»K) <6 ) NOS. AT! ZZ , M/ 208 ) $ 

I NC ( K«K+1 } $  INC ! M*M+1 )$ 

I F-I NT ( M>207 ) GOTO ( N.PROB ) S 

I F- INT ( K<13) GOTO! 14.4)$  SET(K*OJ$  G0T0114.4) 

LIST 

END  GOTO(START) 


8TAI 

BTA2 

BT2.1 

BTA3 

2.7 

ALP1 

ALP2 


16.11 

14.4 
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JERRY  THOMAS  -  GAMMA  DISTRIBUTION 


BLOC ( U-U6Q0 ) V-V6GQ ) W-W 600  JXX-XX600 ) 

BLOC(0-Q&00) RR-RR600 ) 

BLOC!  WW-WW600 ) VV-VV600  JX-X700) EE-EE12) 

BLOC! $-$30) Y-Y50  ) 

SYN  (Z=0?6) (CV=087)(ZZ*G 1=038) (G2»089) ( J=02) $ 

SYN  (Z1*09S)(T1*08X)5 
START  R£A0-(F1)-(NR)M)$ 

ENTER ( SETDPO  )  i 

EE7=.2i  EE8= .IS  EE9*.05S  EE10-.02S 
FE*.9S  £  E 1  * .  8  S  Ec2*.7$  EE3-.6*  EE4*.5$ 

EE5=.4i  Et 6= . 3$ 

EE11*.01$  EE  12* *005$ 

SET  f HP*0 ) $SET(H*0 ) S 
PRINT(2  3><R>14XALPHA>19><Z>17>$$ 

.1  REA0-! F2 ) - ! A  )  S 

INC(Hp»HP+l) 

SET ( E*0 ) P*0 ) D*0 ) $ 

IF- 1  NT {09=0) GOTO! 10,0)6  SET(K=0>$ 
IF!A*0)G0T0<2.2)$ 

SET ( FL*1 )  t  A • *A+1 S  ENTER! L GAMMA) A* ) LGA) *  ' 

£NTER{ CVFTOI )A)AI )* 

GOTO ( 2« 3  )  $  i 

,2  SET ( Ft=2  )  $ 

.25  SET ( K*0  j  i 

.3  B4( I R2 ) ( /LLZ ) ( IR2 ) 

.3  MXR! IRl ) IR2 ) 0 ) $  SHR( 0 50 ) IR2 ) $  R-IR2S 

TP(/Z7U/ZLL)R)  S  A!R)050)R)* 
IF(R>l)GR(R<O)GOT0!ll.O)$ 

GOTO,  FU2. 4)2.8)$ 

.4  Y'=A$  CFY«1-R$ 

.5  S*EXP( A^LOG! Y' )-Y * -LGA ) $  SET(I«1)$  C0N«0*SUM=Si 

.6  S.I«U-C0N)*$,U-1  )/Y»$  SUM  =  SUM+S,  I  $$ 

C0N*C0N+1 6  COUNT! A  1  +  15  IN! I )G0T0!2.6)$ 

DY« ( CFY-SUM ) /S$  Y*=Y»-OY$ 

!F-A8S-N0T(DY<. 0001) GOTO! 2.5 )i  GOTO (3.0) * 

.8  Y* *-LOG( 1-R) $ 

.0  Y,K»Y‘$ 

COUNT !M) IN! K)GOTO( 2.3)$  SET(K»0)$ 

TSTFWO  IF! Y,K<*Y1,K)G0T0( 10.22) 

Y60*  Y ,  ;< i  Y » K»Y  1 ,  K $  Y 1 » K  *Y6  Oi  P*K$ 

10.44  IF-INT! P*0)G0T0!TSTFWD) i  iNC(P*P-l)$ 

IF !YfP<=Yl,P) GOTO! TSTFWO) i 

Y60*Y , P 4  Y , P=Yl , P $  Y1.P-Y60S  GOTO(10.44)$ 
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1C. 22 
10.33 


9,5 

11.1 

11.44 


11.22 


3.3 

3.4 

3.5 

3.6 

3.7 

3.8 

3.9 
6.3 


13.0 


10.0 

11.0 

ERROR 

GAMMA 

LGAMMA 


COUNT (M-l )  IN(K.)GOTG(TSTFwD) 

0,E=CY,  (M-l)-Y,  (M-2)  )✓(  Y,  (M-l)-Y) 

U, E=  i  Y,  t.V-1  )-Y,  (M-2)  )/[  Y,  (M«l)-Yl)S 

V ,  E  =  (  Yl-Y  )  /  (  Y »  (M-2  )-Y)$ 

W  j  E= ( Y , (M-3)-Y, (  M  -  1 ) )/{ Yl-Y« (M-l)  >$ 

XXtE=CY2-Y)/ CY, <M-2)-Y) 

WW,E»(Y,(M-31-Yt(M-l) )/<Y2-Yt (M-l) )  $ 

VV,E*(Y2-Y)/ (Y, (M-3)-Y) 

RR t  E= ( Yl-Y ) / ( Y» ( M-i )-Y ) $  INC(E«E+1)$ 

COUNT ( NR ) INC D) GOTO (2.25)$ 

PRINK  RIO  L0WER> 

SET ( P=0 ) 

SETCF»0) 

IFCRR,E<-RR1,E)G0TC< 11.22) 

Y60=RR»  ES  RR,E*RR1»E$  RR1,E*Y60$  L=E 
IF- 1  NT C  L=0 ) GOTO ( 11 .1) S  lNC(L=L-i) 

IF ( RRt  L<=RR1 1 L ) GOTOC 11.1)$ 

Y60*RR»  L$  RRtL«RRltL$  RR1.L»Y60$  G0TCC11.44)$ 
COUNT ( NR— l 1 INCE)GOTO(ll.l) 

X»  H*R(\59  $  XI «  H*RR1 19$  X2tH»RR179$  X3,H*RR239$ 
X4,H«RR299$  X12»H»RR596$ 

X5tH»RR359$  X6,H«RR419$  X7,H-RR479$  X8,H*RR539$ 
X9»H«RR569$  X10,H«RR587$  X11»H«RR593$ 

INC  C  H»H+13 ) $ 

INCt  P*P  +  l ) $  IF-INT(P>7) GOTOC 6.3) 
l F~ INT  t  P*1 JGOTOC3.3) S  I F- INTI P«2) GOTOC 3.4 ) $ 

I F-I NT  t  P»3  JGOTO (3.5)$  IF-INTCP*4)G0T0C3.6)$ 
f F-I NT l P»5 )GOTO (3.7)$ 

IF-INT(P*6)G0T0(3.8)$  I F- I NTC P*7)GOTO(3. 9) $ 

MOVE (600 ) NOS • FROM ( 0 ) TO ( RR ) $  GOTOC 9. 5 ) 

MOVE ( 600 )N0S .FROM (U) TOC RR) $  GOTOC 9. 5 ) 

MOVE ( 600  > NOS . FROM { V ) TO <  RR ) S  GOTOC 9 . 5 ) 
M0VE(600)N0S.FR0MCW)TOCRR) S  GOTOC 9. 5) 

MOVE ( 600 ) NOS. FROM ( XX) TOC RR)$  GOTOC  9.5) 

MOVE ( 600 ) NOS • FROM C  WW )T0 ( RR ) $  GOTOC 9. 5) 

MOV EC 600) NOS. FROM ( VV)TO(RR>$  GOTOC 9. 5) 

I F- INT ( HP<6 ) GOTOC  2.1) 

SET  t  H=0 ) ( P*0 ) 

PRINT- ( F6 )-( EE,P) ( fcJNOS.AT (X.H/104) 

I NC  C  H*H+1 ) $  INC(P*P+1 )$  IF-INT(H>103)G0TO(10,O)S 
IF-INTCP<13)G0T0( 13.0)$  SET(P=0)$  G0T0(13.O)$ 
ENTERC  SEX APR ) IR2) IR2)* 

GOTOCN.PROB) $$ 

PRINT <R® > ( R ) $  GOTOCN.PROB) $ 

PRINT-(F3)-CR)A)Y* )DY)$  GOTOCN.PROB) $ 

TP6CSELF  +  1X045M  ALPH)$  J(  AL1XAL2)(  SELF+2) 

TP1 1C  SELF-1) ( 046 ) ( ALPH) $  AXC 1 ) ( EX1 ) { EXIT ) * 
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1.2 


1.3 


~«4 


S70R 

-:;:t 


D4 

:u 

1R2 

m  * 

•  4. 

,52 

F  5 

F3 

,*4 

Fo 


TP(3  ) (047 ) ( STOR  )  $ 

2*t  2  t  2 i= »  2 1  CV*1$ 

I F (  2 >D 1 >GOTO( 1.3) $ 

CV=CV*Z,  Z«Z+051»$  GOTO ( 1*1) 

CR121CV)  IE.1)S  IF-ABS(CV<1>G0T0(1.2>$ 

ZZ»1/Z*ZS  SET ( J®0 ) $  B12(C0)))$ 

PKA(ZZ) (0)(C1,JJ$  LPX, J)(6)(1.4)$  D ( 0 ) ( Z  )  ( G1 ) S 
G2=  LOG ( Z ) $  S ( Z ) ( 02  J  (  0 ) $  M < G2 ) < 0 ) < G2 ) * 
AA(Gl)(u3)(G2)$  G2*G2-ZS 
GOTO ( AL  i)i 

Gl* AflS  ( CV )  S  Gl=LOG  ( G 1 )  S  S  (  G2 > ( Gl 1 ( 0 ) $  GOTO(STOR)* 
;F(G2>04)GOTO(E.1> S  Gl»EXP ( G2  J  $  D( Gl ) (C V) ( 0) S 
R1 2 ( 0 ) (0) (0) S 
GOTO ( EX  IT ) $ 

GOTO ( 1 ) S 

SX(EXIT) (EX1 J (1)0  SET 1 2  =  EWD  )  ( 3*Z1 ) GOTO! 060) 
ALFNGAMMA 
CEO  (10.) 

DEC  (.5) 

DEC  (.91893853320467267301) 

DEC  (.00641025641) 

DEC  (-.001917526918 X .C008417508418) (-.0005952381) 

DEC  (.0007936507936508  X-. 002777777777777777) 

DEC  (.083333333333333333) 

DEC  (350.) 

CSC  (17450580596923828125) 

SEXA(C579K2F59S9820KS6) 

FORM (4-10) 1-2) 

FORM (10-10 ) $ 

F0RM(12-6)3-2)l-10) 

FORM (12-6-13) 3-2) 12-2-6) 3-4) 12-6-13) 3-2) 12-6-13) $ 

FO RM ( 3-14 ) 12-6-1 3 ) 3-10 ) 12-2-6) 3-10) 12-6-13) 3-14) S 
FORM ( 3-14 ) 12-4)3-2 ) 1-1 >12-653-1)1-6)$ 

LIST 

END  GOTO ( ST  ART ) $  $ 
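
The two listings above are too damaged to run as printed. Judging only from the statements that survive (sorting of each sample, ratios of differences of order statistics, and reading ordered values off at fixed positions), they appear to simulate Dixon's ratio statistics for samples drawn from beta and gamma populations. The Python sketch below illustrates that idea for the gamma case. It is not a transcription of the original program: the function names, the seed, the percentile rule, and the choice of 600 replications are assumptions made for illustration, and only the upper-tail forms of the ratios are shown.

```python
import random

def dixon_ratios(sample):
    """Dixon's upper-tail ratios r10 ... r22 for one sample (needs n >= 5)."""
    y = sorted(sample)
    if len(y) < 5:
        raise ValueError("need at least five observations")
    top = y[-1]
    return {
        "r10": (top - y[-2]) / (top - y[0]),
        "r11": (top - y[-2]) / (top - y[1]),
        "r12": (top - y[-2]) / (top - y[2]),
        "r20": (top - y[-3]) / (top - y[0]),
        "r21": (top - y[-3]) / (top - y[1]),
        "r22": (top - y[-3]) / (top - y[2]),
    }

def upper_percentile(values, alpha):
    """Empirical value exceeded by roughly a fraction alpha of the values."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int((1.0 - alpha) * len(ordered)))
    return ordered[index]

rng = random.Random(1)   # fixed seed so the run is repeatable
shape = 4.0              # gamma shape; skewness is 2 / sqrt(shape) = 1.00
r21_values = [
    dixon_ratios([rng.gammavariate(shape, 1.0) for _ in range(15)])["r21"]
    for _ in range(600)
]
r21_at_10_percent = upper_percentile(r21_values, 0.10)

example = dixon_ratios([1.0, 2.0, 3.0, 4.0, 10.0])
```

With only 600 replications the simulated percentage points carry visible sampling noise, which may account for some of the non-monotone entries in the tables.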




A METHOD FOR APPROXIMATING
PROBABILITY FUNCTIONS DEFINED ON FINITE DOMAINS

Joseph S. Tyler, Jr.

Systems Analysis Office
Edgewood  Arsenal,  Maryland 


I. INTRODUCTION. The incentive for this paper arose from the
requirement to determine approximately the probability density function
h(d) of the random variable D from a knowledge of the first r-moments
(about the origin) of that variate.

Specifically, the moments are computed from equation (1.0):

(1.0)  M_r(D) = \iiiint_{\Omega} D^r(\xi, \eta, u, v) \, f(\xi, \eta, u, v) \, d\xi \, d\eta \, du \, dv,

for r = 0, 1, 2, ..., N,

where D is a known continuous function and f is a known continuous
probability density function of the variates \xi, \eta, u, v. Moreover,
the range of D is known, 0 \le D \le 1, and the integration is performed
over the Euclidean four-dimensional space \Omega.
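
When the four-fold integral in (1.0) has no closed form, the moments can be estimated by sampling. The sketch below is a minimal Monte Carlo illustration, not the computation used in the paper. The function D_product and the sampler sample_uniform are hypothetical stand-ins (independent uniform variates on the unit hypercube), chosen because the exact moments are then (1/(r+1))^4 and the estimate can be checked.

```python
import random

def estimate_moments(D, sample_f, max_order, n_samples=100_000, seed=0):
    """Monte Carlo estimate of M_r = E[D(xi, eta, u, v)**r], r = 0..max_order."""
    rng = random.Random(seed)
    sums = [0.0] * (max_order + 1)
    for _ in range(n_samples):
        d = D(*sample_f(rng))
        power = 1.0
        for r in range(max_order + 1):
            sums[r] += power      # accumulate d**r
            power *= d
    return [s / n_samples for s in sums]

def sample_uniform(rng):
    """Hypothetical density f: four independent uniform variates on [0, 1]."""
    return tuple(rng.random() for _ in range(4))

def D_product(xi, eta, u, v):
    """Hypothetical D with range contained in [0, 1]."""
    return xi * eta * u * v

moments = estimate_moments(D_product, sample_uniform, max_order=3)

# For this toy case the exact moments are (1 / (r + 1)) ** 4.
exact = [(1.0 / (r + 1)) ** 4 for r in range(4)]
```

The sampling error of each estimate shrinks like one over the square root of n_samples, so higher moments, which are small numbers here, need more samples for the same relative accuracy.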

It has been demonstrated by H. Hamburger, 1920 (Ref 1), that
when the domain of definition of a probability function is finite
then that function is uniquely determined by the set of all of its
moments. A method of constructing a probability density function
defined on [-1,1], from the infinite set of its moments, has been
published by Philip Davis in his book, INTERPOLATION AND APPROXIMATION,
1961 (Ref 2). The method is essentially an infinite series
expansion in Legendre polynomials. However, from a statistical
viewpoint, it is not practical to construct the required function h
from the entire set of its moments. The purpose of this paper,
therefore, is to present a method, employing only the first r-moments,
by which nonnegative approximations of probability density functions
on [0,1] can be constructed.

Essentially, the approximation method is based on an iterative
procedure.  The  first  step  utilizes  the  first  r-moments  of  the  random 
variable  D  to  specify  the  initial  approximation  to  the  function  h. 
Secondly,  successive  improvements  over  the  initial  approximation  are 
achieved  by  applying  a  modified  version  of  the  classical  method  for 
representing  continuous  functions  by  orthonormal  polynomials.  The 
error  of  the  approximation  is  measured  in  terms  of  the  given  original 
first  r-moments  of  the  variate  D. 

II. ESSENTIAL ASPECTS FROM THE CLASSICAL THEORY. In general,
any continuous function g(x) defined on the finite interval [0,1]
can be expanded in a series of weighted orthonormal polynomials
w(x) \sum_i c_i \theta_i(x).

Specifically, it will be required that the following set of
conditions be satisfied:

A. g(x) \in C^1[0,1] (i.e., the function g and its first
derivative are continuous on the closed interval [0,1]).

B. A sequence of polynomials {\theta_i(x)}, continuous, bounded,
and orthonormal with respect to some weight function
w(x) \ge 0 on [0,1], is known.

C. w(x) g(x) and w(x) g^2(x) are integrable on [0,1].

The sequence of polynomials {\theta_i(x)} satisfies the following
properties:

[P1]  \int_0^1 w(x) \theta_i(x) \theta_j(x) dx = 0,  i \ne j

[P2]  \int_0^1 w(x) \theta_i^2(x) dx = 1

[P3]  \int_0^1 x^r w(x) \theta_i(x) dx = 0,  r < i

[P4]  \int_0^1 P_m(x) w(x) \theta_i(x) dx = 0,  m < i

[P5]  a_n > 0 denotes the coefficient of x^n in \theta_n(x);
      a_{n+1} > 0 denotes the coefficient of x^{n+1} in \theta_{n+1}(x)

(P_m(x) denotes any polynomial of degree m).


Under the orthogonality conditions on {\theta_i}, the expansion of
g(x) can be expressed as:

(1)  g(x) = \sum_{i=0}^{\infty} c_i \theta_i(x),   c_i = \int_0^1 w(x) g(x) \theta_i(x) dx.

Let S_n(x) denote the partial sum,

(1.1)  S_n(x) = \sum_{i=0}^{n} c_i \theta_i(x).

Then by the definition of the c_i's and the orthonormal properties of
the \theta_i's, we have

(1.2)  \int_0^1 w(x) [g(x) - S_n(x)]^2 dx = \int_0^1 w(x) [g(x)]^2 dx - \sum_{i=0}^{n} c_i^2.

Since the first member of equation (1.2) is nonnegative, the same
is true of the second member and

(1.3)  \sum_{i=0}^{n} c_i^2 \le \int_0^1 w(x) [g(x)]^2 dx,  for all values of n.

Consequently,

\sum_{i=0}^{\infty} c_i^2

is convergent, and

(1.4)  \lim_{n \to \infty} c_n = 0.

Hence, we conclude that S_n(x) converges to g(x) in the least square
sense over the finite interval [0,1].
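
The identity (1.2) and the resulting bound on the sum of squared coefficients can be checked numerically. The sketch below assumes the weight w(x) = 1, for which the orthonormal polynomials on [0,1] are the shifted Legendre polynomials, and takes g(x) = exp(x) purely as an illustrative choice; the helper names and the Simpson quadrature are not part of the paper.

```python
from math import comb, exp, sqrt

def theta_coeffs(n):
    """Power-basis coefficients of the orthonormal shifted Legendre
    polynomial of degree n on [0, 1] (weight w(x) = 1)."""
    return [sqrt(2 * n + 1) * (-1) ** (n + k) * comb(n, k) * comb(n + k, k)
            for k in range(n + 1)]

def poly_eval(coeffs, x):
    """Horner evaluation of sum_k coeffs[k] * x**k."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def simpson(fn, panels=2000):
    """Composite Simpson rule on [0, 1]; panels must be even."""
    h = 1.0 / panels
    total = fn(0.0) + fn(1.0)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * fn(k * h)
    return total * h / 3.0

g = exp                                  # illustrative choice of g(x)
norm_g = simpson(lambda x: g(x) ** 2)    # integral of w * g^2 with w = 1

c = []       # expansion coefficients c_i
gaps = []    # right-hand side of (1.2) for n = 0, 1, 2, 3
for n in range(4):
    a = theta_coeffs(n)
    c.append(simpson(lambda x: g(x) * poly_eval(a, x)))
    gaps.append(norm_g - sum(ci * ci for ci in c))

def S(x, n):
    """Partial sum S_n(x) of the expansion."""
    return sum(c[i] * poly_eval(theta_coeffs(i), x) for i in range(n + 1))

# Left-hand side of (1.2) for n = 2, computed directly.
lhs = simpson(lambda x: (g(x) - S(x, 2)) ** 2)
```

Each entry of gaps is the weighted squared error of the corresponding partial sum, so the list is nonnegative and decreasing, which is the content of the inequality following (1.2).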

Under the assumption that g'(x) is continuous on [0,1], it can
be demonstrated that S_n(x) converges to g(x) for every x \in [0,1] as n
increases without bound.

The Christoffel-Darboux identity (Ref 3) provides the following
symmetric kernel function K_n(x,t), of order n, for the system of
polynomials \theta_i(x). That is,

K_n(x,t) = K_n(t,x) = \sum_{i=0}^{n} \theta_i(t) \theta_i(x),

(2)  K_n(x,t) = \frac{a_n}{a_{n+1}} \cdot \frac{\theta_{n+1}(t) \theta_n(x) - \theta_n(t) \theta_{n+1}(x)}{t - x},

where

a_n > 0 is the coefficient of x^n in \theta_n(x),

a_{n+1} > 0 is the coefficient of x^{n+1} in \theta_{n+1}(x),

a_{n+1} > a_n.
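
The Christoffel-Darboux identity can be verified numerically for a concrete polynomial system. The sketch below again assumes w(x) = 1 and the orthonormal shifted Legendre polynomials on [0,1]; it compares the kernel computed as a finite sum with the closed form built from the leading coefficients a_n and a_{n+1}. The function names and the test point are illustrative assumptions.

```python
from math import comb, sqrt

def theta_coeffs(n):
    """Power-basis coefficients of the orthonormal shifted Legendre
    polynomial of degree n on [0, 1] (weight w(x) = 1)."""
    return [sqrt(2 * n + 1) * (-1) ** (n + k) * comb(n, k) * comb(n + k, k)
            for k in range(n + 1)]

def theta(n, x):
    """Evaluate theta_n(x) by Horner's rule."""
    result = 0.0
    for c in reversed(theta_coeffs(n)):
        result = result * x + c
    return result

def kernel_sum(n, x, t):
    """K_n(x, t) as the sum of theta_i(t) * theta_i(x), i = 0..n."""
    return sum(theta(i, t) * theta(i, x) for i in range(n + 1))

def kernel_closed(n, x, t):
    """K_n(x, t) from the Christoffel-Darboux closed form (requires t != x)."""
    a_n = theta_coeffs(n)[-1]         # leading coefficient of theta_n
    a_n1 = theta_coeffs(n + 1)[-1]    # leading coefficient of theta_{n+1}
    numerator = theta(n + 1, t) * theta(n, x) - theta(n, t) * theta(n + 1, x)
    return (a_n / a_n1) * numerator / (t - x)

n, x, t = 4, 0.3, 0.8
k_sum = kernel_sum(n, x, t)
k_closed = kernel_closed(n, x, t)

# Leading coefficients for degrees 0..5; they increase for this system.
leading = [theta_coeffs(k)[-1] for k in range(6)]
```

The closed form is what makes the later convergence argument workable: it replaces a sum of n + 1 products by two products divided by t - x.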

By utilizing this identity, we may express S_n(x), equation (1.1),
as

(2.1)  S_n(x) = \int_0^1 w(t) g(t) K_n(x,t) dt,

and from the orthonormal properties of the \theta_i's we have

(2.2)  1 = \int_0^1 w(t) K_n(x,t) dt.

Multiplication of equation (2.2) by g(x), which is constant with
respect to the variable of integration, gives

(2.3)  g(x) = \int_0^1 w(t) g(x) K_n(x,t) dt.

Hence, by subtraction of equation (2.3) from (2.1),

(2.4)  S_n(x) - g(x) = \int_0^1 w(t) [g(t) - g(x)] K_n(x,t) dt.

Then, by substitution of K_n(x,t), equation (2), in equation (2.4) one
obtains, for an arbitrary x \in [0,1], the relation

(2.5)  S_n(x) - g(x) = \frac{a_n}{a_{n+1}} \left[ \theta_n(x) \int_0^1 w(t) \frac{g(t) - g(x)}{t - x} \theta_{n+1}(t) dt
                         - \theta_{n+1}(x) \int_0^1 w(t) \frac{g(t) - g(x)}{t - x} \theta_n(t) dt \right].

The proof that S_n(x) converges to g(x) on [0,1] consists of
showing that equation (2.5) approaches zero as n becomes infinite.

Since, by hypothesis, the \theta_i's are bounded and a_n > 0, a_{n+1} > 0,
then (a_n / a_{n+1}) is also bounded. Moreover, the derivative g' is
continuous on [0,1].

That is,

(2.6)  g'(x) = \lim_{t \to x} \frac{g(t) - g(x)}{t - x},

and from equations (1) and (1.4), it follows that

(2.7)  \lim_{j \to \infty} c'_j = \lim_{j \to \infty} \int_0^1 w(t) \frac{g(t) - g(x)}{t - x} \theta_j(t) dt = 0,

where the index j denotes either n or n+1 in equation (2.5). Therefore,
S_n(x) approaches g(x) for every x \in [0,1] and the expansion of g(x) can
be written as

(2.8)  g(x) = \sum_{i=0}^{\infty} c_i \theta_i(x),  for x \in [0,1].

In the derivation of the method of approximating probability
functions defined on finite domains, the following theorem for weighted
orthonormal polynomials will be required:

Theorem 1. Let H(x) denote a polynomial of degree m that is
nonnegative on [0,1]. Let \theta_i(x), i = 0,1,..., be the orthonormal
polynomials corresponding to the weight function w(x) on [0,1]. Let
q_i(x), i = 0,1,..., be the orthonormal polynomials associated with the
weight function H(x)w(x). Then boundedness of the \theta_i's assures the
boundedness of the q_i's.

Proof. The product H(x) q_n(x) is a polynomial of degree n+m
and can be expressed in the form

(3)  H(x) q_n(x) = \sum_{i=0}^{n+m} c_{ni} \theta_i(x),

where

c_{ni} = \int_0^1 w(x) H(x) q_n(x) \theta_i(x) dx.

If i < n, then c_{ni} = 0 as a consequence of the orthogonality properties
of q_n(x) with respect to the weight function H(x) w(x), so that

(3.1)  H(x) q_n(x) = \sum_{i=n}^{n+m} c_{ni} \theta_i(x).

As for the coefficients c_{ni} which do not vanish,

(3.2)  |c_{ni}| \le \int_0^1 [w(x)]^{1/2} H(x) |q_n(x)| \cdot [w(x)]^{1/2} |\theta_i(x)| dx,

       c_{ni}^2 \le \int_0^1 w(x) [H(x)]^2 [q_n(x)]^2 dx \cdot \int_0^1 w(x) [\theta_i(x)]^2 dx.

The last expression follows from Schwarz's inequality, and the last
integral is equal to 1, since the \theta_i's are normalized.

Let G = \max_{x \in [0,1]} H(x); then

(3.3)  c_{ni}^2 \le G \int_0^1 w(x) H(x) [q_n(x)]^2 dx = G,

so that |c_{ni}| \le G^{1/2}, and since the \theta_i's are also bounded,
that is, |\theta_i(x)| < A for all x \in [0,1], we have from equation (3.1)

(3.4)  |H(x) q_n(x)| = |H(x)| \cdot |q_n(x)| \le G^{1/2} A (m+1).

The polynomial H(x) by hypothesis has a lower positive bound on
[0,1]; therefore, the polynomials q_n(x) are also bounded on [0,1].

III. APPROXIMATING PROBABILITY DENSITY FUNCTIONS ON [0,1]. The
information and results discussed in Section II are next utilized in
the formulation of a method for constructing nonnegative approximations
of continuous probability density functions defined on the closed domain
[0,1].

A. Assumptions and Notations

1. Let f(x) denote a probability density function, and
f'(x) its first derivative, and assume that both f(x)
and f'(x) are continuous on [0,1].

2. It is assumed that the first r-moments, m_j, j = 0,1,...,r,
of f(x) are known.

3. Let P_r(x) denote a polynomial of degree r.

4. It is assumed that the orthonormal polynomials
{\theta_i(x)}, associated with weight function w(x), are
known.

B. The Initial Approximation

The probability density function f(x), by equation (2.8),
can be represented by the following expansion:

(4)  \frac{f(x)}{w(x)} = g(x) = \sum_{i=0}^{\infty} c_i \theta_i(x),  for x \in [0,1],

or

f(x) = w(x) \sum_{i=0}^{\infty} c_i \theta_i(x).

The coefficients c_i are computed from the relation

(4.1)  c_i = \int_0^1 \theta_i(x) f(x) dx.

Since \theta_i(x) is a polynomial of degree i, it can be written as

(4.2)  \theta_i(x) = \sum_{j=0}^{i} a_{ij} x^j,  a_{ii} > 0,

and equation (4.1) can, therefore, be expressed as

(4.3)  c_i = \sum_{j=0}^{i} a_{ij} \int_0^1 x^j f(x) dx,

or by

(4.4)  c_i = \sum_{j=0}^{i} a_{ij} m_j,

where

m_j = \int_0^1 x^j f(x) dx,  (j = 0,1,...,r).

The finite set of moments m_j, (j = 0,1,...,r), are reproducible
from the expansion given by equation (4). That is,

(4.5)  m_j = \int_0^1 x^j f(x) dx = \sum_{i=0}^{j} c_i \int_0^1 x^j w(x) \theta_i(x) dx
             + \sum_{i=j+1}^{\infty} c_i \int_0^1 x^j w(x) \theta_i(x) dx.

By property [P3], the last integral is zero; therefore,

(4.5.1)  m_j = \sum_{i=0}^{j} c_i \int_0^1 x^j w(x) \theta_i(x) dx,  (j = 0,1,...,r).

As a consequence of equation (4.5), the approximation for
the density function f(x), based on its r-moments m_j, has the following
form:

(4.6)  f(x) \approx w(x) \sum_{i=0}^{r} c_i \theta_i(x).

Moreover, it is observed that the reproducibility of the moments
possessed by f(x) is independent of the choice of the weight function.
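
The construction of the initial approximation from a handful of moments can be sketched in a few lines. The example below assumes w(x) = 1 (so the orthonormal polynomials are the shifted Legendre polynomials) and uses the moments of the density 5(1 - x)^4 as an illustrative target with r = 2; neither choice comes from the paper. It forms the coefficients c_i from the moments alone, checks that the truncated expansion reproduces those moments, and records the smallest value of the approximation on a grid.

```python
from math import comb, sqrt

def theta_coeffs(n):
    """Power-basis coefficients a_{n0}, ..., a_{nn} of the orthonormal
    shifted Legendre polynomial of degree n on [0, 1] (weight w(x) = 1)."""
    return [sqrt(2 * n + 1) * (-1) ** (n + k) * comb(n, k) * comb(n + k, k)
            for k in range(n + 1)]

def poly_eval(coeffs, x):
    """Horner evaluation of sum_k coeffs[k] * x**k."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def initial_approximation(moments):
    """Power-basis coefficients of f_0 = sum_i c_i theta_i for w(x) = 1."""
    r = len(moments) - 1
    f0 = [0.0] * (r + 1)
    for i in range(r + 1):
        a = theta_coeffs(i)
        # c_i is a linear combination of the moments m_0, ..., m_i.
        c_i = sum(a[j] * moments[j] for j in range(i + 1))
        for j in range(i + 1):
            f0[j] += c_i * a[j]
    return f0

def moment(poly, j):
    """Exact integral of x**j * poly(x) over [0, 1]."""
    return sum(ck / (k + j + 1) for k, ck in enumerate(poly))

# Moments m_0, m_1, m_2 of the illustrative density 5 * (1 - x)**4.
m = [1.0, 1.0 / 6.0, 1.0 / 21.0]
f0 = initial_approximation(m)
reproduced = [moment(f0, j) for j in range(len(m))]
lowest = min(poly_eval(f0, k / 1000.0) for k in range(1001))
```

For this target the quadratic f_0 dips below zero on part of [0,1] even though it matches all three moments, which is the situation that a nonnegativity correction has to address.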

C. Successive Improvements Over Initial Approximation

It may happen that the initial approximation becomes
negative on [0,1], and in this event it is not a satisfactory
representation of the given probability density function f(x). The
following approximation scheme is introduced so as to remove the
possibility of obtaining a negative approximation for f(x). For this
purpose, the initial approximation can be rewritten as

(4.7)  f(x) \approx f_0(x) = w(x) \sum_{i=0}^{r} c_i \theta_i(x).

If f_0(x) is nonnegative on [0,1], then the approximation of
f(x) by f_0(x) possesses the same first r-moments as possessed by
f(x), by equation (4.5), and the process is therefore terminated
at this step. However, if this is not the case, then a positive
constant h_1 can be determined such that

(4.7.1)  w_1(x) = \frac{w(x)}{h_1 + 1} \left[ h_1 + \sum_{i=0}^{r} c_i \theta_i(x) \right] \ge 0,  for x \in [0,1].

(The method by which h_1 > 0 is determined is presented in subsection F.)

The first improvement over the initial approximation f_0(x) is
obtained by constructing a new sequence of orthonormal polynomials
{q_i^{(1)}(x), i = 0,1,...,r} with respect to the new weight function w_1(x)
(the sequence {q_i^{(1)}} can be obtained by applying the Schmidt
orthonormalization process (Ref 3)) and then computing a new approximation
by applying equations (4.4) and (4). The new approximation has the
following form:

(4.8)  f(x) \approx f_1(x) = w_1(x) \sum_{i=0}^{r} c_i^{(1)} q_i^{(1)}(x),

for

c_i^{(1)} = \int_0^1 q_i^{(1)}(x) f(x) dx = \sum_{j=0}^{i} a_{ij}^{(1)} m_j,

where a_{ij}^{(1)} denotes the coefficient of x^j in q_i^{(1)}(x).

Moreover,

(4.8.1)  w_1(x) \sum_{i=0}^{r} c_i^{(1)} q_i^{(1)}(x)
           = \frac{w(x)}{h_1 + 1} \left[ h_1 + \sum_{i=0}^{r} c_i \theta_i(x) \right] \sum_{i=0}^{r} c_i^{(1)} q_i^{(1)}(x),

or

f_1(x) = w(x) P_{2r}(x),

and since P_{2r}(x) is a polynomial of degree 2r we can write equation
(4.8) as

(4.8.2)  f(x) \approx f_1(x) = w(x) \sum_{i=0}^{2r} d_i \theta_i(x),

where

d_i = \int_0^1 \theta_i(x) f(x) dx.
If f_1(x), equation (4.8), is nonnegative on [0,1], then the
process is terminated with the first improvement over the initial
approximation. If, however, f_1(x) becomes negative on [0,1], then a
second positive constant h_2 is determined and the computations
indicated by equations (4.7.1) and (4.8) are repeated.

The results obtained after repeating the above process n times
are expressed by the following relations:

(4.9)  f(x) \approx f_n(x) = w_n(x) \sum_{i=0}^{r} c_i^{(n)} q_i^{(n)}(x),

where

c_i^{(n)} = \int_0^1 q_i^{(n)}(x) f(x) dx = \sum_{j=0}^{i} a_{ij}^{(n)} m_j,

with a_{ij}^{(n)} the coefficient of x^j in q_i^{(n)}(x),

(4.9.1)  w_n(x) = \frac{w_{n-1}(x)}{h_n + 1} \left[ h_n + \sum_{i=0}^{r} c_i^{(n-1)} q_i^{(n-1)}(x) \right],

         w_n(x) = w_{n-1}(x) \cdot P_r(x),

         w_n(x) = w(x) \cdot P_{nr}(x),

(4.9.2)  f(x) \approx f_n(x) = w(x) \cdot P_{(n+1)r}(x),

         f_n(x) = w(x) \sum_{i=0}^{(n+1)r} d_i \theta_i(x),

where

d_i = \int_0^1 \theta_i(x) f(x) dx.
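
One pass of the improvement step can be sketched as follows: build the modified weight from the current approximation, orthonormalize 1, x, ..., x^r against it by the Schmidt process, recompute the coefficients from the same moments, and form the new approximation. The example assumes w(x) = 1, r = 2, and the moments of 5(1 - x)^4, so every weight is a polynomial and all integrals are exact. The rule used here for h_1 (just large enough to make the new weight positive on a grid, plus a small margin) is an assumption for illustration and is not the construction of subsection F.

```python
from math import sqrt

def pmul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def padd(p, q, scale=1.0):
    """Return p + scale * q."""
    out = [0.0] * max(len(p), len(q))
    for k, a in enumerate(p):
        out[k] += a
    for k, b in enumerate(q):
        out[k] += scale * b
    return out

def pint(p):
    """Exact integral of the polynomial over [0, 1]."""
    return sum(a / (k + 1) for k, a in enumerate(p))

def peval(p, x):
    """Horner evaluation."""
    result = 0.0
    for c in reversed(p):
        result = result * x + c
    return result

def orthonormal_polys(weight, r):
    """Schmidt orthonormalization of 1, x, ..., x**r for a polynomial weight."""
    basis = []
    for n in range(r + 1):
        v = [0.0] * n + [1.0]
        for q in basis:
            v = padd(v, q, -pint(pmul(weight, pmul(v, q))))
        norm = sqrt(pint(pmul(weight, pmul(v, v))))
        basis.append([ck / norm for ck in v])
    return basis

def expand(weight, basis, moments):
    """weight * sum_i c_i q_i, with c_i = sum_j a_ij m_j."""
    total = [0.0]
    for q in basis:
        c_i = sum(a * moments[j] for j, a in enumerate(q))
        total = padd(total, q, c_i)
    return pmul(weight, total)

m = [1.0, 1.0 / 6.0, 1.0 / 21.0]   # illustrative moments of 5 * (1 - x)**4
r = 2
w0 = [1.0]
f0 = expand(w0, orthonormal_polys(w0, r), m)
lowest = min(peval(f0, k / 1000.0) for k in range(1001))
h1 = max(0.0, -lowest) + 0.05      # assumed rule for the positive constant

# New weight: (h1 + f0) / (h1 + 1), valid here because w(x) = 1.
w1 = [(ck + (h1 if k == 0 else 0.0)) / (h1 + 1.0) for k, ck in enumerate(f0)]
f1 = expand(w1, orthonormal_polys(w1, r), m)
reproduced = [pint(pmul([0.0] * j + [1.0], f1)) for j in range(r + 1)]
```

The improved approximation f1 is a polynomial of degree 2r that still reproduces m_0, ..., m_r; whether it is nonnegative has to be checked separately, and if it is not, the same step is repeated with w1 in place of w0.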


D. Convergence of Process.

The convergence of the above process can be demonstrated
by applying equation (1.2) along with the following replacements or
substitutions:

1.  g(x) = \frac{f(x)}{w(x)},  c_i = d_i

2.  S_n(x) = \frac{f_n(x)}{w(x)},  n = (n+1)r

Equation (1.2) then becomes

(4.10)  \int_0^1 w(x) \left[ \frac{f(x)}{w(x)} - \frac{f_n(x)}{w(x)} \right]^2 dx
          = \int_0^1 \frac{f^2(x)}{w(x)} dx - \sum_{i=0}^{(n+1)r} d_i^2 \ge 0,

and \sum_{i=0}^{(n+1)r} d_i^2 is convergent as n \to \infty, provided
f^2(x)/w(x) is integrable over [0,1]; also

\lim_{n \to \infty} d_n = 0.
n  » 


E.  The Error $E_j^{(n)}$.

Having assumed that the finite set of moments $\{m_j,\ j = 0, 1, \ldots, r\}$
is known, we then essentially carry out the approximation $f_n(x)$ by
applying equation (4.9). The coefficients $c_i^{(n)}$ are computed from the
given set of moments and it appears natural to measure the error of the
approximation in terms of the moments.

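The error is defined by

(4.11)  $E_j^{(n)} = \mu_j^{(n)} - m_j$,  for $j = 0, 1, \ldots, r$

with

$m_j = \int_0^1 x^j f_n(x)\, dx, \qquad \mu_j^{(n)} = \int_0^1 x^j w_n(x)\, dx.$

Next let

(4.11.1)  $w_1(x) = \frac{h_1 w(x) + w(x) \sum_{i=0}^{r} c_i\, \theta_i(x)}{h_1 + 1},$

then

$\mu_j^{(1)} = \frac{h_1 \mu_j^{(0)} + m_j}{h_1 + 1}$

and

$E_j^{(1)} = \frac{h_1}{h_1 + 1} \left[ \mu_j^{(0)} - m_j \right].$

The errors $E_j^{(2)}$ are determined as follows:

(4.11.2)  $w_2(x) = \frac{h_2 w_1(x) + w_1(x) \sum_{i=0}^{r} c_i^{(1)} q_i^{(1)}(x)}{h_2 + 1},$

so that

$\mu_j^{(2)} = \frac{h_2 \mu_j^{(1)} + m_j}{h_2 + 1}$

and

$E_j^{(2)} = \frac{h_2}{h_2 + 1} \left[ \mu_j^{(1)} - m_j \right].$

The error $E_j^{(n)}$ associated with the n-th approximation has the
form

(4.12)  $E_j^{(n)} = \frac{h_n}{h_n + 1} \left[ \mu_j^{(n-1)} - m_j \right], \quad (j = 0, 1, \ldots, r)$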

and this error approaches zero, as $n \to \infty$, provided that $h_n \to 0$.
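The behaviour of the moment error can be illustrated with a few lines of arithmetic. The sketch below assumes equation (4.12) is read as the recursion E^(n) = h_n/(h_n + 1) * E^(n-1) for a single moment index j; the function name `moment_errors` and the sample values are hypothetical.

```python
def moment_errors(e0, hs):
    """Return [E^(0), E^(1), ..., E^(n)] for one moment index j, applying
    the factor h_n / (h_n + 1) once per step."""
    errors = [float(e0)]
    for h in hs:
        errors.append(h / (h + 1.0) * errors[-1])
    return errors

# Example: an initial error of 0.3 and two steps with h = 1 halve the error twice.
print(moment_errors(0.3, [1.0, 1.0]))
```

Each factor h_n/(h_n + 1) lies strictly between 0 and 1 for positive h_n, so the error shrinks at every step; small constants h_n shrink it fastest, while large constants leave it almost unchanged.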


F.  Construction of Positive Constants $h_n$.

It has previously been shown that the approximation $f_n(x)$,
equation (4.9), has the same first r moments that are possessed by
f(x). In order that $f_n(x)$ represent a probability density function
it is necessary that $f_n(x)$ be nonnegative on the domain [0,1].

The nonnegativity of the approximation is next considered.

The approximation, at the n-th step, can be expressed as follows:

(5)  $f(x) \approx w(x)\, h(x), \quad x \in [0,1]$

where

(5.1)  $h(x) = \sum_{i=0}^{r} c_i\, q_i(x)$

and

(5.2)  $q_i(x) = \sum_{j=0}^{i} a_{ij}\, x^j.$

By substituting $q_i(x)$, equation (5.2), in equation (5.1), h(x) can be
written in the form of a polynomial. That is,

(5.3)  $h(x) = \sum_{i=0}^{r} A_{ri}\, x^i$

for

(5.4)  $A_{ri} = \sum_{j=i}^{r} c_j\, a_{ji}.$
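The change from the orthonormal-polynomial form (5.1) to the power form (5.3) is a single matrix product. The sketch below assumes the coefficients a_ij are stored as a lower-triangular array `a` with `a[i, j]` multiplying x**j in q_i(x); the function name and the sample numbers (unnormalized shifted Legendre polynomials) are illustrative only.

```python
import numpy as np

def power_coefficients(c, a):
    """Return A with A[i] = sum over j >= i of c[j] * a[j, i],
    i.e. the coefficient of x**i in h(x) = sum_i c[i] * q_i(x)."""
    c = np.asarray(c, dtype=float)
    a = np.tril(np.asarray(a, dtype=float))   # enforce a[i, j] = 0 for j > i
    return a.T @ c

# q_0 = 1, q_1 = -1 + 2x, q_2 = 1 - 6x + 6x^2 (hypothetical example values)
a = [[1, 0, 0], [-1, 2, 0], [1, -6, 6]]
A = power_coefficients([1.0, 0.5, 0.25], a)   # coefficients of 1, x, x^2
```

For these values the power form is h(x) = 0.75 - 0.5 x + 1.5 x^2, which can be confirmed by expanding the three polynomials by hand.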


By definition w(x) is nowhere negative on [0,1], and if the
approximation becomes negative on (0,1) it is due to h(x) being
negative on that interval. Therefore, if the polynomial h(x) has a
real root, of order one, on (0,1) it implies there exists at least one
point $x_0 \in (0,1)$ such that $h(x_0) < 0$. The following two theorems can
be applied to determine the possibility of a real root of h(x) on (0,1).

Theorem 2:  On the upper bound for the real roots of a polynomial.

Let R(x) be a polynomial of degree r, and let $R^{(k)}(1) \geq 0$ for
(k = 0, 1, ..., r), where $R^{(k)}$ denotes the k-th derivative of polynomial
R(x). Then the point x = 1 is an upper bound for the real roots of R(x).

Proof:  By Taylor's formula, the polynomial R(x) can be expanded
about the point x = 1. That is,

(5.5)  $R(x) = R(1) + \sum_{k=1}^{r} \frac{R^{(k)}(1)}{k!}\, (x-1)^k.$

By hypothesis

$R^{(k)}(1) \geq 0$,  for (k = 0, 1, ..., r).

Since R(x) has degree r, $R^{(r)}(1)$ is r! times the leading coefficient
and is therefore strictly positive. Hence, R(x) for x > 1, by equation
(5.5), is also positive. Therefore, x = 1 is an upper bound for all the
real roots of R(x).
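The hypothesis of Theorem 2 is easy to evaluate for a given polynomial. The following sketch uses NumPy's polynomial helpers with coefficients listed in increasing powers; the function names are hypothetical, and the two sample polynomials are chosen only for illustration.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def derivs_at_one(coef):
    """Return R^(k)(1) for k = 0, ..., r, where coef[i] multiplies x**i."""
    return np.array([P.polyval(1.0, P.polyder(coef, k))
                     for k in range(len(coef))])

def one_is_upper_bound(coef):
    """True when every derivative of R at x = 1 is nonnegative, which by
    Theorem 2 guarantees that R has no real root greater than 1."""
    return bool(np.all(derivs_at_one(coef) >= 0))

print(one_is_upper_bound([-1.0, 1.5, 1.0]))   # (x - 0.5)(x + 2): roots below 1
print(one_is_upper_bound([6.0, -5.0, 1.0]))   # (x - 2)(x - 3): roots above 1
```

The condition is sufficient but not necessary: a polynomial may fail the derivative test and still have all of its real roots at or below 1.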

Theorem 3:  On the lower bound for the positive real roots of a
polynomial.

(5.10)  $R(x) = x^r \left[ b_0 x^{-r} + \ldots + b_r \right]$,  or

(5.11)  $R(x) = b_0 + b_1 x + \ldots + b_r x^r$,  or

(5.12)  $R(x) = b_0 (1 - x_1 x)(1 - x_2 x) \ldots (1 - x_r x)$

The polynomial R(x), equation (5.11), simply reverses the order of the
coefficients of h(x), and the roots of R(x), equation (5.12), are simply
the reciprocals of the roots of h(x).

If no real root of h(x) lies in (0,1), then no real root of R(x)
lies in the interval $(1, \infty)$ and by theorem 2, x = 1 is an upper bound for
the real roots of R(x). Moreover, x = 1 is a lower bound for the positive
real roots of h(x).

Proof:  The roots of R(x) are the elements of the set
$\{1/x_i : i = 1, 2, \ldots, r\}$. Moreover,

(5.13)  (1).  $x_i < 0$ implies $\frac{1}{x_i} < 0$, and

(5.14)  (2).  $x_i > 1$ implies $\frac{1}{x_i} < 1$.

Hence, (1) and (2) together imply that R(x) has no real roots in the
open interval $(1, \infty)$. Therefore, x = 1 is an upper bound for the real
roots of R(x). Moreover, x = 1 is a lower bound for the positive real
roots of h(x), which implies that h(x) has no real roots on the interval
(0,1).
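The relation between h(x) and the reversed polynomial R(x) can be confirmed on a small example. The sketch below assumes coefficients are listed in increasing powers and that h has a nonzero constant term (so that no root is zero); the helper names and the sample polynomial are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def reversed_polynomial(coef):
    """Coefficients of R(x) = x**r * h(1/x): the coefficients of h reversed."""
    return list(coef)[::-1]

def reciprocal_polynomial_roots(coef):
    """Sorted roots of the reversed polynomial R(x)."""
    return np.sort(P.polyroots(reversed_polynomial(coef)))

h = [-8.0, 2.0, 1.0]                            # h(x) = (x - 2)(x + 4)
roots_R = reciprocal_polynomial_roots(h)        # roots of 1 + 2x - 8x^2
roots_h_inverted = np.sort(1.0 / P.polyroots(h))
```

Here the roots of h are 2 and -4, and both computations give -0.25 and 0.5. A root of h inside (0,1) would therefore appear as a root of R greater than 1, which is what the derivative test of Theorem 2 is designed to rule out.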


By the application of theorems (2) and (3), a test can be
constructed to determine the positivity of the polynomial h(x) on (0,1).
From equation (5.3) h(x) is defined as

(5.15)  $h(x) = \sum_{i=0}^{r} A_{ri}\, x^i.$

Let R(x) be the polynomial

(5.16)  $R(x) = x^r\, h\!\left(\frac{1}{x}\right) = \sum_{i=0}^{r} A_{ri}\, x^{r-i}.$

Then the k-th derivative of R(x) is denoted by

(5.17)  $R^{(k)}(x) = \sum_{i=0}^{r-k} \frac{(r-i)!}{(r-i-k)!}\, A_{ri}\, x^{r-i-k}$

or, with k = r - j,

(5.18)  $R^{(r-j)}(x) = \frac{r!}{j!}\, A_{r0}\, x^{j} + \sum_{i=1}^{j} \frac{(r-i)!}{(j-i)!}\, A_{ri}\, x^{j-i},$

and if h(x) has no real roots on (0,1), then by theorem (3),

(5.19)  $R^{(r-j)}(1) = \frac{r!}{j!}\, A_{r0} + \sum_{i=1}^{j} \frac{(r-i)!}{(j-i)!}\, A_{ri} \geq 0.$

That is,

(5.20)  $A_{r0} \geq -\frac{j!}{r!} \sum_{i=1}^{j} \frac{(r-i)!}{(j-i)!}\, A_{ri}$

for (j = 0, 1, ..., r).

When the relation on $A_{r0}$ (equation 5.20) is satisfied, the
approximation, equation (5), is positive on (0,1) and represents the
probability density function.

However, if $R^{(r-j)}(1) < 0$ for any value of j, then h(x) must
be modified such that its modified form becomes a positive function
on (0,1). Essentially, the constant term $A_{r0}$ is increased by some
positive constant h until equation (5.20) is satisfied for all values
of j = 0, 1, ..., r.
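The adjustment of the constant term can be sketched as a small routine. It assumes relation (5.20) is read as: for each j = 0, ..., r, the constant term must be at least -(j!/r!) times the sum over i = 1, ..., j of (r-i)!/(j-i)! times the coefficient of x**i. Coefficients are passed in increasing powers, and the function names are hypothetical.

```python
from math import factorial

def min_constant_term(A):
    """Smallest constant term satisfying relation (5.20) for every j,
    given the higher coefficients A[1], ..., A[r]; A[0] is ignored."""
    r = len(A) - 1
    bounds = []
    for j in range(r + 1):
        s = sum(factorial(r - i) / factorial(j - i) * A[i]
                for i in range(1, j + 1))
        bounds.append(-factorial(j) / factorial(r) * s)
    return max(bounds)

def raise_constant_term(A):
    """Return (h, A_new): the nonnegative increment h added to A[0] so that
    (5.20) holds for all j, and the modified coefficient list."""
    h = max(0.0, min_constant_term(A) - A[0])
    return h, [A[0] + h] + list(A[1:])

# h(x) = 0.1 - x + x^2 is negative on part of (0, 1); the test raises A[0].
h, A_new = raise_constant_term([0.1, -1.0, 1.0])
```

In this example the required constant term is 0.5, so h is 0.4 and the modified polynomial 0.5 - x + x^2 has no real roots. Because the derivative test is only a sufficient condition, the increment found this way may be larger than strictly necessary.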

The product of w(x) and the modified positive function h(x)
produces a new weight function, and this new weight function is then
used to generate a new set of orthonormal polynomials
$\{q_i(x) : i = 0, 1, \ldots, r\}$ needed to obtain the next improved
approximation. The process is terminated when the succeeding
approximation becomes positive on (0,1) and the succeeding weight
function has moments that are arbitrarily close to the moments of f(x).